Data Structures RPI Spring 2017 Lecture Notes
Today
Discussion of Website & Syllabus:
http://www.cs.rpi.edu/academics/courses/spring17/ds/
Getting Started in C++ & STL, C++ Syntax, STL Strings
These days, the process of compilation is almost instantaneous for simple programs, and in this course we
encourage you to follow the same incremental editing & frequent testing development strategy that is employed
with interpreted languages.
Finally, many interpreted languages have a Just-In-Time (JIT) compiler that can run an interpreted programming
language and perform optimization on-the-fly, resulting in program performance that rivals optimized
compiled code. Thus, the differences between compiled and interpreted languages are somewhat blurry.
You will practice the cycle of coding & compilation & testing during Lab 1. You are encouraged to try out
different development environments (code editor & compiler) and quickly settle on one that allows you to be
most productive. Ask your lab TAs & mentors about their favorite programming environments! The course
website includes many helpful links as well.
As you see in today's handout, C++ has more required punctuation than Python, and the syntax is more
restrictive. The compiler will proofread your code in detail and complain about any mistakes you make. Even
long-time C++ programmers make mistakes in syntax, and with practice you will become familiar with the
compiler's error messages and how to correct your code.
1.3 A Sample C++ Program: Find the Roots of a Quadratic Polynomial
#include <iostream>   // library for reading & writing from the console/keyboard
#include <cmath>      // library with the square root function & absolute value
#include <cstdlib>    // library with the exit function

// Returns true if the candidate root is indeed a root of the polynomial a*x*x + b*x + c = 0
bool check_root(int a, int b, int c, float root) {
  // plug the value into the formula
  float check = a * root * root + b * root + c;
  // see if the absolute value is zero (within a small tolerance)
  if (fabs(check) > 0.0001) {
    std::cerr << "ERROR: " << root << " is not a root of this formula." << std::endl;
    return false;
  } else {
    return true;
  }
}

/* Use the quadratic formula to find the two real roots of polynomial. Returns
   true if the roots are real, returns false if the roots are imaginary. If the roots
   are real, they are returned through the reference parameters root_pos and root_neg. */
bool find_roots(int a, int b, int c, float &root_pos, float &root_neg) {
  // compute the quantity under the radical of the quadratic formula
  int radical = b*b - 4*a*c;
  // if the radical is negative, the roots are imaginary
  if (radical < 0) {
    std::cerr << "ERROR: Imaginary roots" << std::endl;
    return false;
  }
  float sqrt_radical = sqrt(radical);
  // compute the two roots
  root_pos = (-b + sqrt_radical) / float(2*a);
  root_neg = (-b - sqrt_radical) / float(2*a);
  return true;
}

int main() {
  // We will loop until we are given a polynomial with real roots
  while (true) {
    std::cout << "Enter 3 integer coefficients to a quadratic function: a*x*x + b*x + c = 0" << std::endl;
    int my_a, my_b, my_c;
    std::cin >> my_a >> my_b >> my_c;
    // create a place to store the roots
    float root_1, root_2;
    bool success = find_roots(my_a,my_b,my_c, root_1,root_2);
    // If the polynomial has imaginary roots, skip the rest of this loop and start over
    if (!success) continue;
    std::cout << "The roots are: " << root_1 << " and " << root_2 << std::endl;
    // Check our work...
    if (check_root(my_a,my_b,my_c, root_1) && check_root(my_a,my_b,my_c, root_2)) {
      // Verified roots, break out of the while loop
      break;
    } else {
      std::cerr << "ERROR: Unable to verify one or both roots." << std::endl;
      // if the program has an error, we choose to exit with a
      // non-zero error code
      exit(1);
    }
  }
  // by convention, main should return zero when the program finishes normally
  return 0;
}
1.4 Some Basic C++ Syntax
Comments are indicated using // for single line comments and /* and */ for multi-line comments.
#include asks the compiler for parts of the standard library and other code that we wish to use (e.g. the
input/output stream function std::cout).
int main() is a necessary component of all C++ programs; it returns a value (integer in this case) and it
may have parameters.
{ }: the curly braces indicate to C++ to treat everything between them as a unit.
Each statement may be a single statement, such as the cout statement above, a structured statement, or a
compound statement delimited by {. . .}.
1.9 Functions and Arguments
Functions are used to:
Break code up into modules for ease of programming and testing, and for ease of reading by other people
(never, ever, under-estimate the importance of this!).
Create code that is reusable at several places in one program and by several programs.
Each function has a sequence of parameters and a return type. The function prototype below has a return
type of bool and five parameters.
The order and types of the arguments in the function call (made from the main function in this example) must match
the order and types of the parameters in the function prototype.
The first three parameters to this function are value parameters.
These are essentially local variables (in the function) whose initial values are copies of the values of the
corresponding argument in the function call.
Thus, the value of my_a from the main function is used to initialize a in function find_roots.
Changes to value parameters within the called function do NOT change the corresponding argument in
the calling function.
The final two parameters are reference parameters, as indicated by the &.
Reference parameters are just aliases for their corresponding arguments. No new objects are created.
As a result, changes to reference parameters are changes to the corresponding variables (arguments) in
the calling function.
In general, the Rules of Thumb for using value and reference parameters:
When a function (e.g., check_root) needs to provide just one simple result, make that result the return
value of the function and pass other parameters by value.
When a function needs to provide more than one result (e.g., find_roots), these results should be returned
using multiple reference parameters.
We'll see more examples of the importance of value vs. reference parameters as the semester continues.
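The difference between value and reference parameters can be sketched in a few lines. The function names below (add_one_by_value, add_one_by_reference, min_and_max) are illustrative assumptions and do not come from the handout.

```cpp
// Value parameter: x is a copy, so the caller's variable is untouched.
void add_one_by_value(int x) {
  x = x + 1;
}

// Reference parameter: x is an alias for the caller's variable.
void add_one_by_reference(int &x) {
  x = x + 1;
}

// Two results are returned through the reference parameters smaller and larger.
void min_and_max(int a, int b, int &smaller, int &larger) {
  if (a < b) {
    smaller = a;
    larger = b;
  } else {
    smaller = b;
    larger = a;
  }
}
```

If n holds 5, calling add_one_by_value(n) leaves n at 5, while add_one_by_reference(n) changes it to 6.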
Here is the basic form of a for loop:
for (expr1; expr2; expr3)
statement;
expr1 is the initial expression executed once at the start, before the loop iterations begin;
expr2 is the test applied before the beginning of each loop iteration, the loop ends when this expression
evaluates to false or 0;
expr3 is evaluated at the very end of each iteration;
statement is the loop body
Here is the basic form of a while loop:
while (expr)
statement;
expr is checked before entering the loop and after each iteration. If expr ever evaluates to false, the loop is
finished.
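As a minimal sketch, the same computation can be written with either loop form. The function names sum_with_for and sum_with_while are assumptions made for this example.

```cpp
// Sum the integers 1..n with a for loop.
int sum_with_for(int n) {
  int sum = 0;
  for (int i = 1; i <= n; ++i) {   // initialization; test; end-of-iteration step
    sum += i;                      // the loop body
  }
  return sum;
}

// The same computation with a while loop.
int sum_with_while(int n) {
  int sum = 0;
  int i = 1;
  while (i <= n) {                 // test is checked before every iteration
    sum += i;
    ++i;
  }
  return sum;
}
```

Both functions return 0 when n is less than 1, because the test fails before the first iteration.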
1.12 C-style Arrays
An array is a fixed-length, consecutive sequence of objects all of the same type. The following declares an array
with space for 15 double values. Note the spots in the array are currently uninitialized.
double a[15];
The values are accessed through subscripting operations. The following code assigns the value 3.14159 to
location i=5 of the array. Here i is the subscript or index.
int i = 5;
a[i] = 3.14159;
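A short sketch of filling and then reading a C-style array follows. The function name sum_of_squares and the constant N are assumptions for illustration.

```cpp
// Fill a fixed-length array with i*i and return the sum of its entries.
double sum_of_squares() {
  const int N = 15;
  double a[N];                     // entries are uninitialized until assigned
  for (int i = 0; i < N; ++i) {
    a[i] = i * i;                  // legal subscripts are 0 through N-1
  }
  double total = 0;
  for (int i = 0; i < N; ++i) {
    total += a[i];
  }
  return total;
}
```

Note that the array does not know its own length; the code must carry N separately and keep every subscript in range.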
1.13 Python Strings vs. C chars vs. C-style Strings vs. C++ STL Strings
Strings in Python are immutable, and there is no difference between a string and a char in Python. Thus, 'a'
and "a" are both strings in Python, not individual characters. In C++ & Java, single quotes create a character
type (exactly one character) and double quotes create a string of 0, 1, 2, or more characters.
A C-style string is an array of chars that ends with the special char \0. C-style strings (char* or char[])
can be edited, and there are a number of helper functions to help with common operations. However...
The C++-style STL string type has a wider array of operations and functions, which are more convenient
and more powerful.
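To make the contrast concrete, here is a small sketch that handles a C-style string through the standard strlen helper and an STL string through its own operators. The function names c_string_length and greet are hypothetical.

```cpp
#include <cstring>
#include <string>

// Length of a C-style string, found by scanning for the terminating '\0'.
std::size_t c_string_length(const char *s) {
  return std::strlen(s);
}

// STL strings support concatenation with +, mixing strings and single chars.
std::string greet(const std::string &name) {
  return "Hello, " + name + '!';   // double quotes: string, single quotes: char
}
```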
Strings define a special type string::size_type, which is the type returned by the string function size()
(and length()).
The :: notation means that size_type is defined within the scope of the string type.
string::size_type is an unsigned integer type (often equivalent to unsigned int or unsigned long, depending on the platform).
You may see compiler warnings and potential compatibility problems if you compare an int variable
to a.size().
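One way to avoid the signed/unsigned comparison warning is to declare the loop index with the string's own size type. The sketch below assumes a hypothetical helper named count_char.

```cpp
#include <string>

// Count occurrences of target, indexing with the string's own size type.
std::string::size_type count_char(const std::string &s, char target) {
  std::string::size_type count = 0;
  for (std::string::size_type i = 0; i < s.size(); ++i) {
    if (s[i] == target) {
      ++count;
    }
  }
  return count;
}
```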
This seems like a lot to remember. Do I need to memorize this? Where can I find all the details on string objects?
*******
*     *
* B   *
*  o  *
*   b *
*     *
*******
#include <iostream>
#include <string>

int main() {
  std::cout << "What is your first name? ";
  std::string first;
  std::cin >> first;
  const std::string star_line(first.size()+4, '*');
  std::string middle_line = "*" + std::string(first.size()+2,' ') + "*";
  std::cout << '\n' << star_line << '\n' << middle_line << std::endl;
  // Output the interior of the greeting, one line at a time.
  for (unsigned int i = 0; i < first.size(); ++i ) {
    // Create the output line by overwriting a single character from the
    // first name in location i+2. After printing it restore the blank.
    middle_line[ i+2 ] = first[i];
    std::cout << middle_line << '\n';
    middle_line[ i+2 ] = ' ';
  }
  std::cout << middle_line << '\n' << star_line << std::endl;
  return 0;
}
CSCI-1200 Data Structures Spring 2017
Collaboration Policy & Academic Integrity
iClicker Lecture exercises
Responses to iClicker lecture exercises will be used to earn incentives for the Data Structures course. Discussion
of collaborative iClicker lecture exercises with those seated around you is encouraged. However, if
we find anyone using an iClicker that is registered to another individual or using more than one iClicker, we
will confiscate all iClickers involved and report the incident to the Dean of Students.
Academic Integrity for Exams
All exams for this course will be completed individually. Copying, communicating, or using disallowed
materials during an exam is cheating, of course. Students caught cheating on an exam will receive an F in
the course and will be reported to the Dean of Students for further disciplinary action.
Collaboration Policy for Programming Labs
Collaboration is encouraged during the weekly programming labs. Students are allowed to talk through and
assist each other with these programming exercises. Students may ask for help from each other, the graduate
lab TA, and undergraduate programming mentors. But each student must write up and debug their own
lab solutions on their own laptop and be prepared to present and discuss this work with the TA to receive
credit for each checkpoint.
As a general guideline, students may look over each other's shoulders at their labmate's laptop screen
during lab; this is the best way to learn about IDEs, code development strategies, testing, and debugging.
However, looking should not lead to line-by-line copying. Furthermore, each student should retain control of
their own keyboard. While being assisted by a classmate or a TA, the student should remain fully engaged on
problem solving and ask plenty of questions. Finally, other than the specific files provided by the instructor,
electronic files or file excerpts should not be shared or copied (by email, text, Dropbox, or any other means).
Homework Collaboration Policy
Academic integrity is a complicated issue for individual programming assignments, but one we take very
seriously. Students naturally want to work together, and it is clear they learn a great deal by doing so. Getting
help is often the best way to interpret error messages and find bugs, even for experienced programmers.
Furthermore, in-depth discussions about problem solving, algorithms, and code efficiency are invaluable and
make us all better software engineers. In response to this, the following rules will be enforced for programming
assignments:
Students may read through the homework assignment together and discuss what is asked by the assignment,
examples of program input & expected output, the overall approach to tackling the assignment,
possible high level algorithms to solve the problem, and recent concepts from lecture that might be
helpful in the implementation.
Students are not allowed to work together in writing code or pseudocode. Detailed algorithms and
implementation must be done individually. Students may not discuss homework code in detail (line-
by-line or loop-by-loop) while it is being written or afterwards. In general, students should not look
at each other's computer screen (or hand-written or printed assignment design notes) while working
on homework. As a guideline, if an algorithm is too complex to describe orally (without dictating
line-by-line), then sharing that algorithm is disallowed by the homework collaboration policy.
Students are allowed to ask each other for help in interpreting error messages and in discussing strategies
for testing and finding bugs. First, ask for help orally, by describing the symptoms of the problem. For
each homework, many students will run into similar problems and after hearing a general description
of a problem, another student might have suggestions for what to try to further diagnose or fix the
issue. If that doesn't work, and if the compiler error message or flawed output is particularly lengthy,
it is okay to ask another student to briefly look at the computer screen to see the details of the error
message and the corresponding line of code. Please see a TA during office hours if a more in-depth
examination of the code is necessary.
Students may not share or copy code or pseudocode. Homework files or file excerpts should never be
shared electronically (by email, text, LMS, Dropbox, etc.). Homework solution files from previous years
(either instructor or student solutions) should not be used in any way. Students must not leave their
code (either electronic or printed) in publicly-accessible areas. Students may not share computers in
any way when there is an assignment pending. Each student is responsible for securing their homework
materials using all reasonable precautions. These precautions include: Students should password lock
the screen when they step away from their computer. Homework files should only be stored on private
accounts/computers with strong passwords. Homework notes and printouts should be stored in a
locked drawer/room.
Students may not show their code or pseudocode to other students as a means of helping them. Well-
meaning homework help or tutoring can turn into a violation of the homework collaboration policy
when stressed with time constraints from other courses and responsibilities. Sometimes good students
who feel sorry for struggling students are tempted to provide them with just a peek at their code.
Such peeks often turn into extensive copying, despite prior claims of good intentions.
Students may not receive detailed help on their assignment code or pseudocode from individuals outside
the course. This restriction includes tutors, students from prior terms, friends and family members,
internet resources, etc.
All collaborators (classmates, TAs, ALAC tutors, upperclassmen, students/instructor via LMS, etc.),
and all of the resources (books, online reference material, etc.) consulted in completing this assignment
must be listed in the README.txt file submitted with the assignment.
These rules are in place for each homework assignment and extend two days after the submission deadline.
Homework Plagiarism Detection and Academic Dishonesty Penalty
We use an automatic code comparison tool to help spot homework assignments that have been submitted in
violation of these rules. The tool takes all assignments from all sections and all prior terms and compares
them, highlighting regions of the code that are similar. The plagiarism tool looks at core code structure and
is not fooled by variable and function name changes or addition of comments and whitespace.
The instructor checks flagged pairs of assignments very carefully, to determine which students may have
violated the rules of collaboration and academic integrity on programming assignments. When it is believed
that an incident of academic dishonesty has occurred, the involved students are contacted and a meeting is
scheduled. All students caught cheating on a programming assignment (both the copier and the provider)
will be punished. For undergraduate students, the standard punishment for the first offense is a 0 on the
assignment and a full letter grade reduction on the final semester grade. Students whose violations are more
flagrant will receive a higher penalty. Undergraduate students caught a second time will receive an immediate
F in the course, regardless of circumstances. Each incident will be reported to the Dean of Students.
Graduate students found to be in violation of the academic integrity policy for homework assignments on
the first offense will receive an F in the course and will be reported both to the Dean of Students and to
the chair of their home department with the strong advisement that they be ineligible to serve as a teaching
assistant for any course at RPI.
Academic Dishonesty in the Student Handbook
Refer to The Rensselaer Handbook of Student Rights and Responsibilities for further discussion of academic
dishonesty. Note that: Students found in violation of the academic dishonesty policy are prohibited
from dropping the course in order to avoid the academic penalty.
Number of Students Found in Violation of the Policy
Historically, 5-10% of students are found to be in violation of the academic dishonesty policy each semester.
Many of these students immediately admit to falling behind with the coursework and violating one or more
of the rules above and, if it is a minor first-time offense, may receive a reduced penalty.
Read this document in its entirety. If you have any questions, contact the instructor or the
TAs immediately. Sign this form and give it to your TA during your first lab section.
Name: Section #:
Signature: Date:
CSCI-1200 Data Structures Spring 2017
Lecture 2 STL Strings & Vectors
Announcements
HW 1 will be available on-line this afternoon through the website (on the Calendar).
Be sure to read through this information as you start implementation of HW1:
Misc Programming Information (a link at the bottom of the left bar of the website).
TA & instructor office hours are posted on website (Weekly Schedule).
If you have not resolved issues with the C++ environment on your laptop, please do so immediately.
If you cannot access Piazza or the homework submission server, please email the instructor ASAP with your
RCS ID and section number.
Because many students were dealing with lengthy compiler/editor installation, registration confusion, etc., we
will allow (for the first lab only!) students to get checked off for any remaining Lab 1 checkpoints at the
beginning of next week's Lab 2 or in your grad TA's normal office hours.
Today
STL Strings, char arrays (C-style Strings), & converting between these two types
L-values vs. R-values
STL Vectors as smart arrays
The expression std::string(first.size()+2, ' ') within this statement creates a temporary STL string
but does not associate it with a variable.
A char array can be initialized as: char h[] = {'H', 'e', 'l', 'l', 'o', '!', '\0'};
or as: char h[] = "Hello!";
In either case, array h has 7 characters, the last one being the null character.
The C language provides many functions for manipulating these C-style strings. We don't study them much
anymore because the C++ style STL string library is much more logical and easier to use. If you want
to find out more about functions for C-style strings, look at the cstring library:
http://www.cplusplus.com/reference/cstring/
One place we do use them is in file names and command-line arguments, which you will use in Homework 1.
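Converting between the two string types takes one call in each direction. The sketch below uses only the std::string constructor and the c_str() member function; the wrapper names to_stl_string and first_char_via_c_str are assumptions for illustration.

```cpp
#include <string>

// C-style string -> STL string: the std::string constructor copies the chars
// up to (not including) the terminating '\0'.
std::string to_stl_string(const char *c_style) {
  return std::string(c_style);
}

// STL string -> C-style string: c_str() returns a pointer to a
// null-terminated char array owned by the string object.
char first_char_via_c_str(const std::string &s) {
  const char *p = s.c_str();
  return p[0];                     // this is '\0' when s is empty
}
```

The pointer returned by c_str() is only valid while the string exists and is not modified, so it is best used immediately (for example, when passing a file name to a function that expects a char array).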
std::string a = "Kim";
std::string b = "Tom";
a[0] = b[0];
Let's look closely at the line: a[0] = b[0]; and think about what happens.
In particular, what is the difference between the use of a[0] on the left hand side of the assignment statement
and b[0] on the right hand side?
Syntactically, they look the same. But,
The expression b[0] gets the char value, 'T', from string location 0 in b. This is an r-value.
The expression a[0] gets a reference to the memory location associated with string location 0 in a. This
is an l-value.
The assignment operator stores the value in the referenced memory location.
The difference between an r-value and an l-value will be especially significant when we get to writing our own
operators later in the semester.
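The assignment discussed above can be run as a small sketch. The function name copy_first_letter is hypothetical, and the commented-out line is an added illustration of an expression that cannot appear on the left of an assignment.

```cpp
#include <string>

// a[0] is used as an l-value (a place to store a char);
// b[0] is used as an r-value (a char value to read).
std::string copy_first_letter() {
  std::string a = "Kim";
  std::string b = "Tom";
  a[0] = b[0];
  // b[0] + 1 = 'x';   // would not compile: the left side is not an l-value
  return a;            // a now holds "Tim", b is unchanged
}
```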
What's wrong with this code?
Your C++ compiler will complain with something like: non-lvalue in assignment
Our solution to this problem will be much more elegant, robust, & less error-prone if we use the STL vector
class. Why would it be more difficult/wasteful/buggy to try to write this using C-style (dumb) arrays?
Vectors are an example of a templated container class. The angle brackets < > are used to specify the type of
object (the template type) that will be stored in the vector.
push_back is a vector function to append a value to the end of the vector, increasing its size by one. This is
an O(1) operation (on average).
There is NO corresponding push_front operation for vectors.
size is a function defined by the vector type (the vector class) that returns the number of items stored in the
vector.
After vectors are initialized and filled in, they may be treated just like arrays.
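In the line
sum += scores[i];
scores[i] is an r-value, accessing the value stored at location i of the vector.
We could also write statements like
scores[4] = 100;
to change a score. Here scores[4] is an l-value, providing the means of storing 100 at location 4 of the
vector.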
It is the job of the programmer to ensure that any subscript value i that is used is legal - at least 0 and
strictly less than scores.size().
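The vector operations described above (push_back, size, and subscripting) fit together as in the following sketch. The function names average and sample_scores and the score values are assumptions for illustration.

```cpp
#include <vector>

// Average of the scores; returns 0 for an empty vector.
double average(const std::vector<int> &scores) {
  if (scores.empty()) return 0.0;
  int sum = 0;
  for (std::vector<int>::size_type i = 0; i < scores.size(); ++i) {
    sum += scores[i];              // scores[i] read as an r-value
  }
  return double(sum) / scores.size();
}

// Build a small vector with push_back, then overwrite one entry.
std::vector<int> sample_scores() {
  std::vector<int> scores;         // starts empty
  scores.push_back(90);
  scores.push_back(80);
  scores.push_back(70);            // size is now 3; legal subscripts are 0, 1, 2
  scores[2] = 100;                 // scores[2] used as an l-value
  return scores;
}
```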
This constructs an empty vector of integers. Values must be placed in the vector using push_back.
std::vector<int> a;
This constructs a vector of 100 doubles, each entry storing the value 3.14. New entries can be created using
push_back, but these will create entries 100, 101, 102, etc.
int n = 100;
std::vector<double> b( 100, 3.14 );
This constructs a vector of 10,000 ints. No initial value is given explicitly, so each int is value-initialized to 0.
Again, new entries can be created for the vector using push_back. These will create entries 10000, 10001, etc.
std::vector<int> c( n*n );
This is a compiler error because no constructor exists to create an int vector from a double vector. These are
different types.
std::vector<int> e( b );
2.8 Exercises
1. After the above code constructing the three vectors, what will be output by the following statement?
cout << a.size() << endl << b.size() << endl << c.size() << endl;
2. Write code to construct a vector containing 100 doubles, each having the value 55.5.
3. Write code to construct a vector containing 1000 doubles, containing the values 0, 1, 2, 3, 4, 5, etc.
Write it two ways, one that uses push_back and one that does not use push_back.
// Compute the average and standard deviation of an input set of grades.
#include <fstream>
#include <iomanip>
#include <iostream>
#include <vector> // to access the STL vector class
#include <cmath> // to use standard math library and sqrt
return 0; // everything ok
}
As an example, the following code reads, sorts and outputs a vector of doubles:
double x;
std::vector<double> a;
while (std::cin >> x)
a.push_back(x);
std::sort(a.begin(), a.end());
for (unsigned int i=0; i < a.size(); ++i)
std::cout << a[i] << '\n';
a.begin() is an iterator referencing the first location in the vector, while a.end() is an iterator referencing
one past the last location in the vector.
We will learn much more about iterators in the next few weeks.
Every container has iterators: strings have begin() and end() iterators defined on them.
The ordering of values by std::sort is least to greatest (technically, non-decreasing). We will see ways to
change this.
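As a sketch of both points, the code below sorts the characters of a string through its iterators and sorts a vector from greatest to least by passing a comparison function as a third argument to std::sort. The names sorted_letters, greater_than, and sorted_descending are assumptions for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sort the characters of a string using its begin()/end() iterators.
std::string sorted_letters(std::string s) {
  std::sort(s.begin(), s.end());
  return s;
}

// Comparison function: returns true if a should come before b.
bool greater_than(double a, double b) {
  return a > b;
}

// Sort greatest to least by handing std::sort the comparison function.
std::vector<double> sorted_descending(std::vector<double> v) {
  std::sort(v.begin(), v.end(), greater_than);
  return v;
}
```

Both functions take their argument by value on purpose: the copy is what gets sorted, so the caller's container keeps its original order.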
\[
\text{median} =
\begin{cases}
a_{(n-1)/2} & \text{if } n \text{ is odd} \\
\dfrac{a_{n/2-1} + a_{n/2}}{2} & \text{if } n \text{ is even}
\end{cases}
\]
double compute_median(const std::vector<int> & scores) {
// Create a copy of the vector
std::vector<int> scores_to_sort(scores);
// Sort the values in the vector. By default this is increasing order.
std::sort(scores_to_sort.begin(), scores_to_sort.end());
// Output
std::cout << "Among " << scores.size() << " grades: \n"
<< " average = " << std::setprecision(3) << average << '\n'
<< " std_dev = " << std_dev << '\n'
<< " median = " << median << std::endl;
return 0;
}
This is illustrated by the functions compute_avg_and_std_dev and compute_median in the program
median_grade.
As a general rule, you should not pass a container object, such as a vector or a string, by value because of the
cost of copying.
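A minimal sketch of this rule: the function below takes its vector by const reference, so nothing is copied at the call and the caller's data cannot be changed; the one copy needed for sorting is made explicitly inside. The name median_of is hypothetical, and the function assumes a non-empty vector.

```cpp
#include <algorithm>
#include <vector>

// Median of a non-empty vector, passed by const reference.
double median_of(const std::vector<int> &scores) {
  std::vector<int> sorted(scores);           // explicit copy, sorted below
  std::sort(sorted.begin(), sorted.end());
  std::vector<int>::size_type n = sorted.size();
  if (n % 2 == 1) {
    return sorted[(n - 1) / 2];              // odd count: the middle value
  }
  return (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;   // even: mean of the middle two
}
```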
CSCI-1200 Data Structures Spring 2017
Lecture 3 Classes I
Announcements
The Submitty team is working on an iClicker solution; we will put an announcement out on Piazza when it's ready.
This will let you register through Submitty instead of the iClicker site.
Today's Lecture
Classes in C++: types and defining new types
A Date class.
Class declaration: member variables and member functions
Homework 1 Hints
This section isn't in the printed lecture notes, but it is online.
There are three major tasks in this assignment
Reading in the layout and commands
Managing the seats in a data structure
Managing the upgrade list (not to be confused with an STL list, which we haven't yet covered)
One of the problems is that many people naturally want to use erase(), but we haven't covered it
More importantly, we haven't really discussed iterators, and they're very important to functions like erase()
So how can we handle removing from a vector?
To empty out a vector, we can use clear().
To remove the last value of a vector, we can use pop_back()
We could also remove an element by making a second vector that looks right, and then use an assignment
=.
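Both approaches can be sketched without erase() or iterators. The function names remove_at and remove_at_in_place are assumptions, and both functions assume that index is less than v.size().

```cpp
#include <vector>

// Build a second vector holding everything except position index,
// then assign it back to v.
void remove_at(std::vector<int> &v, std::vector<int>::size_type index) {
  std::vector<int> kept;
  for (std::vector<int>::size_type i = 0; i < v.size(); ++i) {
    if (i != index) {
      kept.push_back(v[i]);
    }
  }
  v = kept;
}

// Alternative: shift later elements one slot left, then drop the last slot.
void remove_at_in_place(std::vector<int> &v, std::vector<int>::size_type index) {
  for (std::vector<int>::size_type i = index; i + 1 < v.size(); ++i) {
    v[i] = v[i + 1];
  }
  v.pop_back();
}
```

Both versions keep the remaining elements in their original order. The second avoids building a temporary vector.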
Let's look at a small program that exercises some of these concepts.
3.1 More Vector Sample Code
#include <iostream>
#include <vector>

int main(){
  std::vector<int> a;
  std::vector<int> b;
  a.push_back(5);
  a.push_back(4);
  a.push_back(3);
  printVector(a, std::cout);
  printVector(b, std::cout);
  b = a;
  printVector(b, std::cout);
  b.pop_back();
  printVector(a, std::cout);
  a.clear();
  printVector(b, std::cout);
  printVector(a, std::cout);
  return 0;
}
3.2 Exercise
What will be the output of the More Vector Sample Code program above?
Encapsulation is the packing of data and functions into a single component.
Information hiding
Users have access to interface, but not implementation
No data item should be available any more than absolutely necessary
To clarify, let's focus on strings and vectors. These are classes. We'll outline what we know about them:
The structure of memory within each class object
The set of operations defined
We are now ready to start defining our own new types using classes.
#include <iostream>
#include "date.h"

int main() {
  std::cout << "Please enter today's date.\n"
            << "Provide the month, day and year: ";
  int month, day, year;
  std::cin >> month >> day >> year;
  Date today(month, day, year);
  // tomorrow starts as a copy of today and is then advanced by one day
  Date tomorrow = today;
  tomorrow.increment();
  Date Sallys_Birthday(2,3,1995);
  if (sameDay(tomorrow, Sallys_Birthday)) {
    std::cout << "Hey, tomorrow is Sally's birthday!\n";
  }
  std::cout << "The last day in this month is " << today.lastDayInMonth() << std::endl;
  return 0;
}
Important: Each object we create of type Date has its own distinct member variables.
Calling class member functions for class objects uses the dot notation. For example, tomorrow.increment();
Note: We don't need to know the implementation details of the class member functions in order to understand
this example. This is an important feature of object oriented programming and class design.
3.7 Exercise
Add code to date_main.cpp to read in another date, check if it is a leap-year, and check if it is equal to tomorrow.
Output appropriate messages based on the results of the checks.
// File: date.h
// Purpose: Header file with declaration of the Date class, including
// member functions and private member variables.
class Date {
public:
  Date();
  Date(int aMonth, int aDay, int aYear);

  // ACCESSORS
  int getDay() const;
  int getMonth() const;
  int getYear() const;

  // MODIFIERS
  void setDay(int aDay);
  void setMonth(int aMonth);
  void setYear(int aYear);
  void increment();

  // other member functions that operate on date objects
  bool isLeapYear() const;
  int lastDayInMonth() const;
  bool isLastDayInMonth() const;

private:  // member variables
  int day;
  int month;
  int year;
};

// prototypes for other functions that operate on class objects are often
// included in the header file, but outside of the class declaration
bool sameDay(const Date &date1, const Date &date2); // same day & month?
And here is the other part of the class implementation, the implementation file date.cpp
// File: date.cpp
// Purpose: Implementation file for the Date class.
#include <iostream>
#include "date.h"
// array to figure out the number of days, it's used by the auxiliary function daysInMonth
const int DaysInMonth[13] = {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
Date::Date(int aMonth, int aDay, int aYear) { // construct from month, day, & year
  month = aMonth;
  day = aDay;
  year = aYear;
}

void Date::setDay(int d) {
  day = d;
}

void Date::setMonth(int m) {
  month = m;
}

void Date::setYear(int y) {
  year = y;
}

void Date::increment() {
  if (!isLastDayInMonth()) {
    day++;
  } else {
    day = 1;
    if (month == 12) { // December
      month = 1;
      year++;
    } else {
      month++;
    }
  }
}
int Date::lastDayInMonth() const {
  if (month == 2 && isLeapYear())
    return 29;
  else
    return DaysInMonth[ month ];
}
3.10 Constructors
These are special functions that initialize the values of the member variables. You have already used constructors
for string and vector objects.
The syntax of the call to the constructor mixes variable definitions and function calls. (See date_main.cpp)
Default constructors have no arguments.
Multiple constructors are allowed, just like multiple functions with the same name are allowed. The compiler
determines which one to call based on the types of the arguments (just like any other function call).
When a new object is created, EXACTLY one constructor for the object is called.
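The points above can be sketched with a deliberately tiny class. Point is a hypothetical class invented for this example; it is not part of the Date code.

```cpp
// A tiny class with two constructors; the compiler picks one per object
// based on the arguments supplied where the object is defined.
class Point {
public:
  Point() { x = 0; y = 0; }                    // default constructor: no arguments
  Point(int aX, int aY) { x = aX; y = aY; }    // construct from two ints
  int getX() const { return x; }
  int getY() const { return y; }
private:
  int x;
  int y;
};
```

Writing Point origin; calls the default constructor, and Point p(3, 4); calls the two-argument constructor. Note that Point origin(); does not create an object: C++ reads it as the declaration of a function named origin.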
Functions that are not members of the Date class must interact with Date objects through the class's public
members (a.k.a., the public interface declared for the class). One example is the function sameDay which
accepts two Date objects and compares them by accessing their day and month values through their public
member functions.
The implementation file, date.cpp, contains the member function definitions. Note that date.h is #include'd.
date_main.cpp contains the code outside the class. Again, date.h is #include'd.
The files date.cpp and date_main.cpp are compiled separately and then linked to form the executable program.
g++ -c -Wall date.cpp
g++ -c -Wall date_main.cpp
g++ -o date.exe date.o date_main.o
or all on one line: g++ -o date.exe date.cpp date_main.cpp
Different organizations of the code are possible, but not preferable. In fact, we could have put all of the code
from the 3 files into a single file main.cpp. In this case, we would not have to compile two separate files.
In many large projects, programmers follow a convention with two files per class, one header file and
one implementation file. This makes the code more manageable and is recommended in this course.
This must appear consistently in both the member function declaration in the class declaration (in the .h file)
and in the member function definition (in the .cpp file).
const objects (usually passed into a function as parameters) can ONLY use const member functions. Remember,
you should only pass objects by value under special circumstances. In general, pass all objects by reference so
they aren't copied, and by const reference if you don't want/need them to change.
While you are learning, you will probably make mistakes in determining which member functions should or
should not be const. Be prepared for compile warnings & errors, and read them carefully.
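A small sketch of how const member functions interact with const reference parameters is shown here. Counter and is_zero are hypothetical names invented for this example.

```cpp
// A class with one const member function and one non-const member function.
class Counter {
public:
  Counter() { count = 0; }
  int get() const { return count; }    // accessor: promises not to modify the object
  void increment() { count++; }        // modifier: changes the object, so not const
private:
  int count;
};

// A const reference parameter may only call const member functions.
bool is_zero(const Counter &c) {
  // c.increment();   // would not compile: increment() is not a const member function
  return c.get() == 0;
}
```

Removing the const after get() would make the call c.get() inside is_zero a compile error, which is the kind of message to expect while learning where const belongs.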
3.14 Exercise
Add a member function to the Date class to add a given number of days to the Date object. The number should be
the only argument and it should be an unsigned int. Should this function be const?
Rule for the duration of the Data Structures course: You may not declare new struct types, and class member
variables should not be made public. This rule will ensure you get plenty of practice writing C++ classes with good
programming style.
3.16 C++ vs. Java Classes
In C++, classes have sections labeled public and private, but there can be multiple public and private
sections. In Java, each individual item is tagged public or private.
Class declarations and class definitions are separated in C++, whereas they are together in Java.
In C++ there is a semi-colon at the very end of the class declaration (after the }).
Write code that uses the member functions (e.g., the main function). Revise the class .h file as necessary.
Write the class .cpp file that implements the member functions.
In general, don't be afraid of major rewrites if you find a class isn't working correctly or isn't as easy to use as you
intended. This happens frequently in practice!
CSCI-1200 Data Structures Spring 2017
Lecture 4 Classes II: Sort, Non-member Operators
Announcements
Exercise solutions will be posted to the calendar.
Submitty iClicker registration is still open. Even if you already registered on the iClicker website,
submit your code on Submitty.
Starting with HW2, when Submitty opens for the homework assignment, there may be a message at the top
regarding an extra late day for earning enough autograder points by Wednesday night.
Practice problems for Exam 1 will be posted Monday, but the solutions will not be posted until the weekend.
We will talk more about the exam next Tuesday.
Today's Lecture
Extended example of student grading program
Passing comparison functions to sort
Non-member operators
// File: main_student.cpp
// Purpose: Compute student averages and output them alphabetically.
#include <algorithm>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <vector>
#include "student.h"
// Compute the averages. At the same time, determine the maximum name length.
unsigned int i;
unsigned int max_length = 0;
for (i=0; i<students.size(); ++i) {
students[i].compute_averages(hw_weight);
unsigned int tmp_length;
tmp_length = students[i].first_name().size() + students[i].last_name().size();
max_length = std::max(max_length, tmp_length);
}
max_length += 2; // account for the output padding with ", "
// Output a header...
out_str << "\nHere are the student semester averages\n";
const std::string header = "Name" + std::string(max_length-4, ' ') + " HW Test Final";
const std::string underline(header.size(), '-');
out_str << header << '\n' << underline << std::endl;
Functionality is relatively simple: input, compute average, provide access to names and averages, and output.
No constructor is explicitly provided: Student objects are built through the read function.
(Other code organization/designs are possible!)
Overall, the Student class design differs substantially in style from the Date class design. We will continue to
see different styles of class designs throughout the semester.
Note the helpful convention used in this example: all member variable names end with the _ character.
The special pre-processor directives #ifndef __student_h_, #define __student_h_, and #endif ensure that
this file is included at most once per .cpp file.
For larger programs with multiple class files and interdependencies, these lines are essential for successful
compilation. We suggest you get in the habit of adding these include guards to all your header files.
// File: student.h
// Purpose: Header for declaration of student record class and associated functions.
#ifndef __student_h_
#define __student_h_
#include <iostream>
#include <string>
#include <vector>
class Student {
public:
// ACCESSORS
const std::string& first_name() const { return first_name_; }
const std::string& last_name() const { return last_name_; }
const std::string& id_number() const { return id_number_; }
double hw_avg() const { return hw_avg_; }
double test_avg() const { return test_avg_; }
double final_avg() const { return final_avg_; }
private: // REPRESENTATION
std::string first_name_;
std::string last_name_;
std::string id_number_;
std::vector<int> hw_scores_;
double hw_avg_;
std::vector<int> test_scores_;
double test_avg_;
double final_avg_;
};
4.4 Implementation of Class Student
The read function is fairly sophisticated and depends heavily on the expected structure of the input data. It
also has a lot of error checking.
In many class designs, this type of input would be done by functions outside the class, with the results
passed into a constructor. We generally prefer this style because it separates elegant class design from clunky
I/O details.
The accessor functions for the names are defined within the class declaration in the header file. In this course,
you are allowed to do this for one-line functions only! For complex classes, including long definitions
within the header file has dependency and performance implications.
The computation of the averages uses some but not all of the functionality from stats.h and stats.cpp (not
included in your handout).
Output is split across two functions. Again, stylistically, it is sometimes preferable to do this outside the class.
// File: student.cpp
// Purpose: Implementation of the class Student
#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
#include "student.h"
#include "std_dev.h"
// Read information about a student, returning true if the information was read correctly.
bool Student::read(std::istream& in_str, unsigned int num_homeworks, unsigned int num_tests) {
// If we don't find an id, we've reached the end of the file & silently return false.
if (!(in_str >> id_number_)) return false;
// Once we have an id number, any other failure in reading is treated as an error.
unsigned int i;
int score;
// Compute and store the hw, test and final average for the student.
void Student::compute_averages(double hw_weight) {
double dummy_stddev;
avg_and_std_dev(hw_scores_, hw_avg_, dummy_stddev);
avg_and_std_dev(test_scores_, test_avg_, dummy_stddev);
final_avg_ = hw_weight * hw_avg_ + (1 - hw_weight) * test_avg_;
}
std::ostream& Student::output_name(std::ostream& out_str) const {
out_str << last_name_ << ", " << first_name_;
return out_str;
}
/* alternative version
bool less_names(const Student& stu1, const Student& stu2) {
if (stu1.last_name() < stu2.last_name())
return true;
else if (stu1.last_name() == stu2.last_name())
return stu1.first_name() < stu2.first_name();
else
return false;
}
*/
4.5 Exercise
Add code to the end of the main() function to compute and output the average of the semester grades and to output
a list of the semester grades sorted into increasing order.
Fortunately, the sort function can be called with a third argument, a comparison function:
sort(students.begin(), students.end(), less_names);
less_names, defined in student.cpp, is a function that takes two const references to Student objects and
returns true if and only if the first argument should be considered less than the second in the sorted order.
less_names uses the < operator defined on string objects to determine its ordering.
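The same mechanism works for any element type. As a small sketch, the hypothetical comparison function shorter below (an assumption for illustration, not part of the student program) orders strings by length and breaks ties alphabetically. It is passed to std::sort as the third argument, just as less_names is.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns true if and only if a should come before b in the sorted order.
// Here: shorter strings first, ties broken alphabetically.
bool shorter(const std::string& a, const std::string& b) {
  if (a.size() != b.size()) return a.size() < b.size();
  return a < b;
}

// Sorts the words in place using the comparison function above.
void sort_by_length(std::vector<std::string>& words) {
  std::sort(words.begin(), words.end(), shorter);
}
```

Note that the comparison function must return false when its two arguments are equivalent; returning true for equal elements breaks the ordering that std::sort relies on.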
4.7 Exercise
Write a function greater_averages that could be used in place of less_names to sort the students vector so that
the student with the highest semester average is first.
When we want to write our own operators, we write them as functions with these weird names.
For example, if we write:
bool operator< (const Student& stu1, const Student& stu2) {
return stu1.last_name() < stu2.last_name() ||
(stu1.last_name() == stu2.last_name() &&
stu1.first_name() < stu2.first_name());
}
4.10 Exercise
Write an operator< for comparing two Date objects.
4.11 Another Class Example: Alphabetizing Names
// name_main.cpp
// Demonstrates another example with the use of classes, including an output stream operator
#include <algorithm>
#include <iostream>
#include <vector>
#include "name.h"
int main() {
std::vector<Name> names;
std::string first, last;
std::cout <<"\nEnter a sequence of names (first and last) and this program will alphabetize them\n";
std::sort(names.begin(), names.end());
std::cout << "\nHere are the names, in alphabetical order.\n";
return 0;
}
4.12 Name Class Declaration & Implementation
#ifndef __NAME__
#define __NAME__
// name.h
#include <iostream>
#include <string>
class Name {
public:
// CONSTRUCTOR
Name(const std::string& fst, const std::string& lst);
// ACCESSORS
// Providing a const reference to the string allows the string to be
// examined and treated as an r-value without the cost of copying it.
const std::string& first() const { return first_; }
const std::string& last() const { return last_; }
// MODIFIERS
void set_first(const std::string & fst) { first_ = fst; }
void set_last(const std::string& lst) { last_ = lst; }
private: // REPRESENTATION
std::string first_, last_;
};
#endif
// name.cpp
#include "name.h"
// Here we use special syntax to call the string class copy constructors
Name::Name(const std::string& fst, const std::string& lst)
: first_(fst), last_(lst)
{}
// operator<
bool operator< (const Name& left, const Name& right) {
return left.last()<right.last() ||
(left.last()==right.last() && left.first()<right.first());
}
CSCI-1200 Data Structures Spring 2017
Lecture 5 Pointers, Arrays, Pointer Arithmetic
Announcements
Submitty iClicker registration is still open. Even if you already registered on the iClicker website,
submit your code on Submitty.
Starting with HW2, when Submitty opens for the homework assignment, there may be a message at the top
regarding an extra late day for earning enough autograder points by Wednesday night.
In fact, right now it's set for 12 autograder points. This is the number you see, and it counts the points from visible
test cases.
There is no initialization of pointer variables in this two-line sequence, so the statement below is dangerous,
and may cause your program to crash! (It won't crash if the uninitialized value happens to be a legal address.)
*p = 15;
Assignments of integers or floats to pointers and assignments mixing pointers of different types are illegal.
Continuing with the above example:
int *r;
r = q; // Illegal: different pointer types;
p = 35.1; // Illegal: float assigned to a pointer
5.5 Exercise
Draw a picture for the following code sequence. What is the output to the screen?
int x = 10, y = 15;
int *a = &x;
cout << x << " " << y << endl;
int *b = &y;
*a = x * *b;
cout << x << " " << y << endl;
int *c = b;
*c = 25;
cout << x << " " << y << endl;
tests to see if p is pointing somewhere that appears to be useful before accessing and printing the value stored
at that location.
But don't make the mistake of assuming pointers are automatically initialized to NULL.
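A minimal sketch of this habit, assuming a hypothetical helper named print_if_valid: the caller initializes the pointer to NULL explicitly, and the helper tests it before dereferencing.

```cpp
#include <cstddef>   // NULL
#include <iostream>

// Prints the value p points to, but only if p is not NULL.
// Returns true if a value was printed.
bool print_if_valid(const int* p) {
  if (p != NULL) {
    std::cout << *p << std::endl;
    return true;
  }
  return false;
}
```

A caller would write int *p = NULL; and later p = &x; before calling the helper. The test only protects against NULL: a pointer that was never initialized can hold any value, so it may pass the test and still be invalid.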
5.7 Arrays
Here's a quick example to remind you about how to use an array:
const int n = 10;
double a[n];
int i;
for ( i=0; i<n; ++i )
a[i] = sqrt( double(i) );
Remember: the size of array a is fixed at compile time. STL vectors act like arrays, but they can grow and
shrink dynamically in response to the demands of the application.
5.8 Stepping through Arrays with Pointers (Array Iterators)
The array code above that uses [] subscripting can be equivalently rewritten to use pointers:
const int n = 10;
double a[n];
double *p;
for ( p=a; p<a+n; ++p )
*p = sqrt( p-a );
The assignment: p = a; takes the address of the start of the array and assigns it to p.
This illustrates the important fact that the name of an array is in fact a pointer to the start of a block of
memory. We will come back to this several times! We could also write this line as: p = &a[0]; which
means find the location of a[0] and take its address.
By incrementing, ++p, we make p point to the next location in the array.
When we increment a pointer we don't just add one byte to the address, we add the number of bytes
(sizeof) used to store one object of the specific type of that pointer. Similarly, basic addition/subtraction
of pointer variables is done in multiples of the sizeof the type the pointer points to.
Since p is a pointer to double, and the size of a double is typically 8 bytes, we are actually adding 8 bytes to the
address when we execute ++p.
The test p<a+n checks to see if the value of the pointer (the address) is less than n array locations beyond
the start of the array.
In this example, a+n is the memory location 80 bytes after the start of the array (n = 10 slots * 8 bytes per
slot).
We could equivalently have used the test p != a+n
In the assignment:
*p = sqrt( p-a )
p-a is the number of array locations (multiples of 8 bytes) between p and the start. This is an integer. The
square root of this value is assigned to *p.
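To make the difference between "slots" and "bytes" concrete, here is a small sketch with two hypothetical helper functions (the names are assumptions for illustration). The first returns p - first, which counts array locations. The second converts both pointers to char pointers so the same distance is measured in bytes.

```cpp
#include <cstddef>   // std::ptrdiff_t

// Number of array slots between first and p. Pointer subtraction is
// measured in elements, not in bytes.
std::ptrdiff_t slots_between(const double* first, const double* p) {
  return p - first;
}

// Number of bytes between the same two addresses. A char is one byte,
// so subtracting char pointers counts bytes.
std::ptrdiff_t bytes_between(const double* first, const double* p) {
  return reinterpret_cast<const char*>(p) -
         reinterpret_cast<const char*>(first);
}
```

For a pointer p that has been incremented three times from the start of a double array a, slots_between(a, p) is 3 and bytes_between(a, p) is 3 * sizeof(double).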
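Here's a picture to explain this example:
[Figure: memory diagram with addresses increasing upward. The array a holds a[0] = 0.00, a[1] = 1.00, a[2] = 1.41, a[3] = 1.73, a[4] = 2.00, a[5] = 2.23, a[6] = 2.45, a[7] = 2.65, a[8] = 2.83, a[9] = 3.00; a[10] is the location just past the end of the array. Also shown: const int n = 10 and double* p.]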
Note that there may or may not be unused memory between your array and the other local variables. Similarly,
the order that your local variables appear on the stack is not guaranteed (the compiler may rearrange things
a bit in an attempt to optimize performance or memory usage). A buffer overflow (attempting to access an
illegal array index) may or may not cause an immediate failure depending on the layout of other critical
program memory.
std::sort( a, a+n );
5.10 Exercises
For each of the following problems, you may only use pointers and not subscripting:
1. Write code to print the array a backwards, using pointers.
2. Write code to print every other value of the array a, again using pointers.
3. Write a function that checks whether the contents of an array of doubles are sorted into increasing order. The
function must accept two arguments: a pointer (to the start of the array), and an integer indicating the size of
the array.
5.11 C Calling Convention
We take for granted the non-trivial task of passing data to a helper function, getting data back from that
function, and seamlessly continuing on with the program. How does that work??
A calling convention is a standardized method for passing arguments between the caller and the function.
Calling conventions vary between programming languages, compilers, and computer hardware.
In C on x86 architectures here is a generalization of what happens:
1. The caller puts all the arguments on the stack, in reverse order.
2. The caller puts the address of its code on the stack (the return address).
3. Control is transferred to the callee.
4. The callee puts any local variables on the stack.
5. The callee does its work and puts the return value in a special register (storage location).
6. The callee removes its local variables from the stack.
7. Control is transferred by removing the address of the caller from the stack and going there.
8. The caller removes the arguments from the stack.
On x86 architectures the addresses on the stack are in descending order. This is not true of all hardware.
5.12 Poking around in the Stack & Looking for the C Calling Convention
Let's look more closely at an example of where the compiler stores our data. Specifically, let's print out the
addresses and values of the local variables and function parameters:
int main() {
int x = 5;
int y = 7;
int answer = foo (x, &y);
std::cout << "address of x = " << &x << std::endl;
std::cout << "address of y = " << &y << std::endl;
std::cout << "address of answer = " << &answer << std::endl;
std::cout << "value at " << &x << " = " << x << std::endl;
std::cout << "value at " << &y << " = " << y << std::endl;
std::cout << "value at " << &answer << " = " << answer << std::endl;
}
Note that the first function parameter is a regular integer, passed by copy. The second parameter is passed
in as a pointer.
Note that we can print out data values or pointers; an address is printed as a big integer in hexadecimal
format (beginning with 0x). This example was compiled as a 32-bit program, so our addresses are 32 bits. A
64-bit program will have longer addresses.
Let's look at the program output and reverse engineer a drawing of the stack:
Program output:
address of a = 0xbf23eef0
address of b = 0xbf23eef4
address of q = 0xbf23eee4
address of r = 0xbf23eee0
value at 0xbf23eef0 = 5
value at 0xbf23eef4 = 0xbf23ef10
value at 0xbf23ef10 = 7
value at 0xbf23eee4 = 6
value at 0xbf23eee0 = 8
address of x = 0xbf23ef14
address of y = 0xbf23ef10
address of answer = 0xbf23ef0c
value at 0xbf23ef14 = 5
value at 0xbf23ef10 = 7
value at 0xbf23ef0c = 48
Stack diagram (addresses in descending order):
address      variable   value
0xbf23ef18
0xbf23ef14   x          5
0xbf23ef10   y          7
0xbf23ef0c   answer     48
0xbf23ef08
0xbf23ef04
0xbf23ef00
0xbf23eefc
0xbf23eef8
0xbf23eef4   b          0xbf23ef10
0xbf23eef0   a          5
0xbf23eeec
0xbf23eee8
0xbf23eee4   q          6
0xbf23eee0   r          8
0xbf23eedc
0xbf23eed8
Note: The unlabeled portions in our diagram of the stack will include the frame pointer, the return address, temp
variables (complex C++ expressions turn into many smaller steps of assembly), space to save registers, and padding
between variables to meet alignment requirements.
Note: Different compilers and/or different optimization levels will produce a different stack diagram.
CSCI-1200 Data Structures Spring 2017
Lecture 6 Pointers & Dynamic Memory
Announcements
Exam 1 is on Monday Feb 6, at 6pm. Check Submitty for room assignments. They might be up already, if not
they should be up by the end of today (Friday). See Lecture 5's notes for more exam-related announcements.
The next homework will be checked for memory errors on the server. Run Dr. Memory or Valgrind on
your code to detect memory errors. See http://www.cs.rpi.edu/academics/courses/spring16/csci1200/
memory_debugging.php for more information on how to run Dr. Memory or valgrind.
int x;
double y;
Static memory: variables allocated statically (with the keyword static). They are not eliminated when
they go out of scope. They retain their values, but are only accessible within the scope where they are defined.
Dynamic memory: explicitly allocated (on the heap) as needed. This is our focus for today.
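The difference between a static variable and an ordinary local variable can be seen in a short sketch. The two function names below are hypothetical, chosen only for this illustration.

```cpp
// Returns 1 on the first call, 2 on the second, and so on.
int next_id() {
  static int counter = 0;  // static: initialized once, keeps its value between calls
  ++counter;
  return counter;
}

// Always returns 1.
int always_one() {
  int local = 0;           // ordinary local: created fresh on every call
  ++local;
  return local;            // local is eliminated when the function returns
}
```

Although counter survives between calls, it is still only accessible by name inside next_id, which matches the scope rule stated above.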
In between the new and delete statements, the memory is treated just like memory for an ordinary variable,
except the only way to access it is through pointers. Hence, the manipulation of pointer variables and values is
similar to the examples covered in Lecture 5 except that there is no explicitly named variable for that memory
other than the pointer variable.
Dynamic allocation of primitives like ints and doubles is not very interesting or significant. What's more
important is dynamic allocation of arrays and objects.
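Even so, a single int is the simplest way to see the new/delete pattern. The sketch below uses a hypothetical function heap_sum (an assumption for illustration): the memory is reached only through the pointer p, and it is released with exactly one delete.

```cpp
// Allocates one int on the heap, uses it through a pointer, and releases it.
int heap_sum(int x, int y) {
  int* p = new int;   // dynamic allocation: the memory has no name of its own
  *p = x;             // read and write only through the pointer
  *p += y;
  int result = *p;    // copy the value out before releasing the memory
  delete p;           // matches the single new above
  return result;
}
```

After delete p, the pointer variable still exists but must not be dereferenced.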
6.3 Exercise
What's the output of the following code? Be sure to draw a picture to help you figure it out.
6.4 Dynamic Allocation of Arrays
How do we allocate an array on the stack? What is the code? What memory diagram is produced by the code?
Declaring the size of an array at compile time doesn't offer much flexibility. Instead we can dynamically allocate
an array based on data. This gets us part-way toward the behavior of the standard library vector class. Here's
an example:
int main() {
std::cout << "Enter the size of the array: ";
int n,i;
std::cin >> n;
double *a = new double[n];
for (i=0; i<n; ++i) { a[i] = sqrt(i); }
for (i=0; i<n; ++i) {
if ( double(int(a[i])) == a[i] )
std::cout << i << " is a perfect square " << std::endl;
}
delete [] a;
return 0;
}
The expression new double[n] asks the system to dynamically allocate enough consecutive memory to hold n
doubles (usually 8n bytes).
What's crucially important is that n is a variable. Therefore, its value and, as a result, the size of the
array are not known until the program is executed and the memory must be allocated dynamically.
The address of the start of the allocated memory is assigned to the pointer variable a.
After this, a is treated as though it is an array. For example: a[i] = sqrt( i );
In fact, the expression a[i] is exactly equivalent to the pointer arithmetic and dereferencing expression *(a+i)
which we have seen several times before.
After we are done using the array, the line: delete [] a; releases the memory allocated for the entire
array and calls the destructor (we'll learn about these soon!) for each slot of the array. Deleting a dynamically
allocated array without the [] is an error (but it may not cause a crash or other noticeable problem, depending
on the type stored in the array and the specific compiler implementation).
Since the program is ending, releasing the memory is not a major concern. However, to demonstrate
that you understand memory allocation & deallocation, you should always delete dynamically allocated
memory in this course, even if the program is terminating.
In more substantial programs it is ABSOLUTELY CRUCIAL. If we forget to release memory repeatedly
the program can be said to have a memory leak. Long-running programs with memory leaks will eventually
run out of memory and crash.
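The equivalence of a[i] and *(a+i), together with the matching delete [], can be sketched in a few lines. The function name sum_first_n is hypothetical and is used only for this illustration.

```cpp
// Fills a dynamically allocated array with 0, 1, ..., n-1 using
// subscripting, then sums it using pointer arithmetic. Assumes n >= 0.
int sum_first_n(int n) {
  int* a = new int[n];
  for (int i = 0; i < n; ++i) {
    a[i] = i;               // subscript form
  }
  int total = 0;
  for (int i = 0; i < n; ++i) {
    total += *(a + i);      // pointer arithmetic form, same slot as a[i]
  }
  delete [] a;              // array form of delete, matching new int[n]
  return total;
}
```

Every path out of the function passes through delete [] a, so repeated calls do not leak memory.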
6.5 Exercises
1. Write code to dynamically allocate an array of n integers, point to this array using the integer pointer variable
a, and then read n values into the array from the stream cin.
2. Now, suppose we wanted to write code to double the size of array a without losing the values. This requires
some work: First allocate an array of size 2*n, pointed to by integer pointer variable temp (which will become
a). Then copy the n values of a into the first n locations of array temp. Finally delete array a and assign temp
to a.
6.6 Dynamic Allocation of Two-Dimensional Arrays
To store a grid of data, we will need to allocate a top level array of pointers to arrays of the data. For example:
double** a = new double*[rows];
for (int i = 0; i < rows; i++) {
a[i] = new double[cols];
for (int j = 0; j < cols; j++) {
a[i][j] = double(i+1) / double (j+1);
}
}
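Every new in the loop above needs a matching delete. As a sketch (the function name grid_sum is an assumption for illustration), the code below builds the same grid, uses it, and then releases each row before releasing the top-level array of pointers.

```cpp
// Allocates a rows x cols grid, fills it as in the example above,
// sums the entries, and then releases every allocation.
double grid_sum(int rows, int cols) {
  double** a = new double*[rows];
  for (int i = 0; i < rows; i++) {
    a[i] = new double[cols];
    for (int j = 0; j < cols; j++) {
      a[i][j] = double(i+1) / double(j+1);
    }
  }
  double total = 0;
  for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
      total += a[i][j];
    }
  }
  // Release each row first; the row pointers live in the top-level array,
  // so that array must be deleted last.
  for (int i = 0; i < rows; i++) {
    delete [] a[i];
  }
  delete [] a;
  return total;
}
```

Deleting only the top-level array would leak every row, because the row addresses would be lost.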
Foo::Foo() {
static int counter = 1;
a = counter;
b = 100.0;
counter++;
}
int main() {
int n;
std::cin >> n;
Foo *things = new Foo[n];
std::cout << "size of int: " << sizeof(int) << std::endl;
std::cout << "size of double: " << sizeof(double) << std::endl;
std::cout << "size of foo object: " << sizeof(Foo) << std::endl;
for (Foo* i = things; i < things+n; i++)
std::cout << "Foo stored at: " << i << " has value " << i->value() << std::endl;
delete [] things;
}
size of int: 4
size of double: 8
size of foo object: 16
Foo stored at: 0x104800890 has value 100
Foo stored at: 0x1048008a0 has value 200
Foo stored at: 0x1048008b0 has value 300
Foo stored at: 0x1048008c0 has value 400
...
6.8 Memory Debugging
In addition to the step-by-step debuggers like gdb, lldb, or the debugger in your IDE, we recommend using a memory
debugger like Dr. Memory (Windows, Linux, and MacOSX) or Valgrind (Linux and MacOSX). These tools can
detect the following problems:
Reading/writing memory after it has been freed (NOTE: delete calls free)
Reading/writing off the end of malloc'd blocks (NOTE: new calls malloc)
Reading/writing inappropriate areas on the stack
1 #include <iostream>
2
3 int main() {
4
5 int *p = new int;
6 if (*p != 10) std::cout << "hi" << std::endl;
7
8 int *a = new int[3];
9 a[3] = 12;
10 delete a;
11
12 }
~~Dr.M~~ 1 unique, 1 total invalid heap argument(s)
~~Dr.M~~ 0 unique, 0 total warning(s)
~~Dr.M~~ 1 unique, 1 total, 4 byte(s) of leak(s)
~~Dr.M~~ 0 unique, 0 total, 0 byte(s) of possible leak(s)
~~Dr.M~~ Details: /DrMemory-MacOS-1.8.0-8/drmemory/logs/DrMemory-a.out.7726.000/results.txt
Note: Dr. Memory on Windows with the Visual Studio compiler may not report a mismatched free() / delete
/ delete [] error (e.g., line 10 of the sample code above). This may happen if optimizations are enabled and the
objects stored in the array are simple and do not have their own dynamically-allocated memory that lead to their
own indirect memory leaks.
==31226== indirectly lost: 0 bytes in 0 blocks
==31226== possibly lost: 0 bytes in 0 blocks
==31226== still reachable: 0 bytes in 0 blocks
==31226== suppressed: 0 bytes in 0 blocks
==31226==
==31226== For counts of detected and suppressed errors, rerun with: -v
==31226== Use --track-origins=yes to see where uninitialised values come from
==31226== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 2 from 2)
6.13 Diagramming Memory Exercises
Draw a diagram of the heap and stack memory for each segment of code below. Use a ? to indicate that the
value of the memory is uninitialized. Indicate whether there are any errors or memory leaks during execution
of this code.
class Foo {
public:
double x;
int* y;
};
Foo a;
a.x = 3.14159;
Foo *b = new Foo;
(*b).y = new int[2];
Foo *c = b;
a.y = b->y;
c->y[1] = 7;
b = NULL;
[Figure: memory diagram with a variable a alongside the values 4.2, 6.5, 8.6, 5.1, 2.9, 3.4, and a variable b.]
CSCI-1200 Data Structures Spring 2017
Lecture 7 Order Notation & Basic Recursion
Review from Lectures 5 & 6
Arrays and pointers, Pointer arithmetic and dereferencing
Different types of memory (automatic, static, dynamic)
Today's Lecture
Algorithm Analysis
Formal Definition of Order Notation
Simple recursion
Visualization of recursion
We want to know the best we can do. (This is often quite hard.)
How do we do it? There are several options, including:
Don't do any analysis; just use the first algorithm you can think of that works.
Implement and time algorithms to choose the best.
Analyze algorithms by counting operations while assigning different weights to different types of operations
based on how long each takes.
Analyze algorithms by assuming each operation requires the same amount of time. Count the total number of
operations, and then multiply this count by the average cost of an operation.
What is the total number of operations performed in executing this fragment? Come up with a function
describing the number of operations in terms of n.
7.3 Exercise: Which Algorithm is Best?
A venture capitalist is trying to decide which of 3 startup companies to invest in and has asked for your help. Here's
the timing data for their prototype software on some different size test cases:
n      foo-a      foo-b      foo-c
10     10 u-sec    5 u-sec      1 u-sec
20     13 u-sec   10 u-sec      8 u-sec
30     15 u-sec   15 u-sec     27 u-sec
100    20 u-sec   50 u-sec   1000 u-sec
1000   ?          ?          ?
Definition: Algorithm A is order $f(n)$, denoted $O(f(n))$, if constants $k$ and $n_0$ exist such that A requires
no more than $k \cdot f(n)$ time units (operations) to solve a problem of size $n \ge n_0$.
For example, algorithms requiring $3n + 2$, $5n - 3$, and $14 + 17n$ operations are all $O(n)$.
This is because we can select values for $k$ and $n_0$ such that the definition above holds. (What values?)
Likewise, algorithms requiring $n^2/10 + 15n - 3$ and $10000 + 35n^2$ are all $O(n^2)$.
Intuitively, we determine the order by finding the asymptotically dominant term (function of n) and throwing
out the leading constant. This term could involve logarithmic or exponential functions of n. Implications for
analysis:
We don't need to quibble about small differences in the numbers of operations.
We also do not need to worry about the different costs of different types of operations.
We don't produce an actual time. We just obtain a rough count of the number of operations. This count
is used for comparison purposes.
In practice, this makes analysis relatively simple, quick and (sometimes unfortunately) rough.
7.7 Best-Case, Average-Case and Worst-Case Analysis
For a given fixed size array, we might want to know:
The fewest number of operations (best case) that might occur.
The average number of operations (average case) that will occur.
The maximum number of operations (worst case) that can occur.
The last is the most common. The first is rarely used.
On the previous algorithm, the best case is O(1), but the average case and worst case are both O(n).
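One algorithm with exactly this profile is linear search, used here as an assumed stand-in because the algorithm the sentence above refers to is not reproduced in this section. The sketch counts comparisons so the best and worst cases can be observed directly; the function name and the comparisons parameter are hypothetical.

```cpp
#include <vector>

// Linear search that also reports how many comparisons it made.
// Returns the index of x in v, or -1 if x is not present.
int linear_search(const std::vector<int>& v, int x, int& comparisons) {
  comparisons = 0;
  for (unsigned int i = 0; i < v.size(); ++i) {
    ++comparisons;
    if (v[i] == x) return int(i);
  }
  return -1;
}
```

If x is in the first slot, one comparison is made regardless of the vector's size (best case, O(1)). If x is in the last slot or absent, all n slots are compared (worst case, O(n)).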
7.12 The Mechanism of Recursive Function Calls
For each recursive call (or any function call), a program creates an activation record to keep track of:
Completely separate instances of the parameters and local variables for the newly-called function.
The location in the calling function code to return to when the newly-called function is complete. (Who
asked for this function to be called? Who wants the answer?)
Which activation record to return to when the function is done. For recursive functions this can be
confusing since there are multiple activation records waiting for an answer from the same function.
This is illustrated in the following diagram of the call fact(4). Each box is an activation record, the solid lines
indicate the function calls, and the dashed lines indicate the returns. Inside of each box we list the parameters
and local variables and make notes about the computation.
caller    tmp = fact(4)
fact(4)   n=4   result = fact(3)   return 4*6   (returns 24)
fact(3)   n=3   result = fact(2)   return 3*2   (returns 6)
fact(2)   n=2   result = fact(1)   return 2*1   (returns 2)
fact(1)   n=1   result = fact(0)   return 1*1   (returns 1)
fact(0)   n=0                      return 1     (returns 1)
This chain of activation records is stored in a special part of program memory called the stack.
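For reference, here is a sketch of a factorial function consistent with the diagram. It is an assumption about how fact is written (the original listing is not part of this section), with a local variable result holding the answer from the recursive call.

```cpp
// Recursive factorial with fact(0) as the base case.
// Assumes n >= 0 and that the result fits in an int.
int fact(int n) {
  if (n == 0) return 1;        // base case: no further calls are made
  int result = fact(n - 1);    // a new activation record is created here
  return n * result;           // execution resumes here after the call returns
}
```

Calling fact(4) creates five activation records in total, one for each value of n from 4 down to 0, and each one keeps its own n and result.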
Often writing recursive functions is more natural than writing iterative functions, especially for a first draft of
a problem implementation.
You should learn how to recognize whether an implementation is recursive or iterative, and practice rewriting
one version as the other. Note: We'll see that not all recursive functions can be easily rewritten in iterative
form!
Note: The order notation for the number of operations for the recursive and iterative versions of an algorithm
is usually the same. However in C, C++, Java, and some other languages, iterative functions are generally
faster than their corresponding recursive functions. This is due to the overhead of the function call mechanism.
Compiler optimizations will sometimes (but not always!) reduce the performance hit by automatically
eliminating the recursive function calls. This is called tail call optimization.
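As a sketch of what this looks like in code, the two hypothetical functions below compute the same factorial. In fact_tail the recursive call is the very last thing the function does, which is the form a compiler may be able to turn into a loop. fact_iter is the loop written by hand. Both assume n >= 0 and a result that fits in an int.

```cpp
// Tail-recursive version: acc carries the partial product, so nothing
// remains to be done after the recursive call returns.
int fact_tail(int n, int acc = 1) {
  if (n <= 1) return acc;
  return fact_tail(n - 1, n * acc);
}

// Iterative version: same O(n) operation count, no function call overhead.
int fact_iter(int n) {
  int result = 1;
  for (int i = 2; i <= n; ++i) {
    result *= i;
  }
  return result;
}
```

Whether the recursive calls in fact_tail are actually eliminated depends on the compiler and the optimization level, so the iterative version is the safer choice when performance matters.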
7.14 Exercises
1. Draw a picture to illustrate the activation records for the function call
cout << intpow(4, 4) << endl;
7.15 Rules for Writing Recursive Functions
Here is an outline of five steps that are useful in writing and debugging recursive functions. Note: You don't have
to do them in exactly this order...
1. Handle the base case(s).
2. Define the problem solution in terms of smaller instances of the problem. Use wishful thinking, i.e., if someone
else solves the problem of fact(4) I can extend that solution to solve fact(5). This defines the necessary
recursive calls. It is also the hardest part!
3. Figure out what work needs to be done before making the recursive call(s).
4. Figure out what work needs to be done after the recursive call(s) complete(s) to finish the computation. (What
are you going to do with the result of the recursive call?)
5. Assume the recursive calls work correctly, but make sure they are progressing toward the base case(s)!
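The five steps can be traced in a small sketch. The function sum_digits below is a hypothetical example chosen for illustration; the comments mark where each step shows up.

```cpp
// Sum of the decimal digits of n, e.g. sum_digits(472) is 4 + 7 + 2.
unsigned int sum_digits(unsigned int n) {
  // Step 1: base case. A single digit is its own sum.
  if (n < 10) return n;
  // Step 3: work before the recursive call. Split off the last digit.
  unsigned int last = n % 10;
  // Steps 2 and 5: recurse on a smaller instance. n / 10 has one fewer
  // digit, so the calls progress toward the base case.
  unsigned int rest = sum_digits(n / 10);
  // Step 4: work after the recursive call. Combine the two pieces.
  return rest + last;
}
```

The wishful thinking in step 2 is the assumption that sum_digits(n / 10) already returns the right answer for the shorter number.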
Exercise: How can you change the second print_vec function as little as possible so that this code prints the
contents of the vector in reverse order?
7.18 Exercises
1. Write a non-recursive version of binary search.
2. If we replaced the if-else structure inside the recursive binsearch function (above) with
if ( x < v[mid] )
return binsearch( v, low, mid-1, x );
else
return binsearch( v, mid, high, x );
CSCI-1200 Data Structures Spring 2017
Lecture 8 Templated Classes & Vector Implementation
Review from Lecture 7
Algorithm Analysis, Formal Definition of Order Notation
Simple recursion, Visualization of recursion, Iteration vs. Recursion,
Rules for writing recursive functions.
Lots of examples!
public:
// MEMBER FUNCTIONS AND OTHER OPERATORS
T& operator[] (size_type i);
const T& operator[] (size_type i) const;
void push_back(const T& t);
void resize(size_type n, const T& fill_in_value = T());
void clear();
bool empty() const;
size_type size() const;
To implement our own generic (a.k.a. templated) vector class, we will implement all of these operations,
manipulate the underlying representation, and discuss memory management.
Within the class declaration, T is used as a type and all member functions are said to be templated over type
T. In the actual text of the code files, templated member functions are often defined (written) inside the class
declaration.
The templated functions defined outside the template class declaration must be preceded by the phrase:
template <class T> and then when Vec is referred to it must be as Vec<T> . For example, for member
function create (two versions), we write:
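As a standalone illustration of this syntax, here is a minimal sketch using a hypothetical Box class (it is not part of vec.h; the class and its member names are made up for this example).

```cpp
// Hypothetical templated class, for illustration only.
template <class T>
class Box {
public:
  Box(const T& v);        // declared here, defined below
  const T& get() const;   // declared here, defined below
private:
  T value_;
};

// Outside the class declaration, each definition repeats "template <class T>"
// and refers to the class as Box<T>.
template <class T> Box<T>::Box(const T& v) : value_(v) {}

template <class T> const T& Box<T>::get() const { return value_; }
```

Because these are templates, the definitions would normally live in the header file along with the class declaration, as vec.h does.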
m_alloc is the total number of slots in the dynamically allocated block of memory.
Drawing pictures, which we will do in class, will help clarify this, especially the distinction between m_size and
m_alloc.
8.6 Typedefs
Several types are created through typedef statements in the first public area of Vec. Once created, the names
are used as ordinary type names. For example, Vec<int>::size_type is the return type of the size() function,
defined here as an unsigned int.
8.7 operator[]
Access to the individual locations of a Vec is provided through operator[]. Syntactically, use of this operator
is translated by the compiler into a call to a function called operator[]. For example, if v is a Vec<int>,
then:
v[i] = 5;
translates into:
v.operator[](i) = 5;
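To see both versions of operator[] at work, here is a minimal sketch using a hypothetical three-element class (IntTriple and sum are names invented for this example, and no bounds checking is done).

```cpp
// Hypothetical fixed-size class used only to illustrate operator[].
class IntTriple {
public:
  IntTriple() { data_[0] = data_[1] = data_[2] = 0; }
  // Non-const version: returns a reference so the caller can assign through it.
  int& operator[](unsigned int i) { return data_[i]; }
  // Const version: called on const objects; the element cannot be modified.
  const int& operator[](unsigned int i) const { return data_[i]; }
private:
  int data_[3];
};

// t is a const reference, so the const operator[] is the one called here.
int sum(const IntTriple& t) { return t[0] + t[1] + t[2]; }
```

With this class, t[1] = 7; and t.operator[](1) = 7; do exactly the same thing.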
8.8 Default Versions of Assignment Operator and Copy Constructor Are Dangerous!
Before we write the copy constructor and the assignment operator, we consider what would happen if we didn't
write them.
C++ compilers provide default versions of these if they are not provided. These defaults just copy the values
of the member variables, one-by-one. For example, the default copy constructor would look like this:
In other words, it would construct each member variable from the corresponding member variable of v. This is
dangerous and incorrect behavior for the Vec class. We don't want to just copy the m_data pointer. We really
want to create a copy of the entire array! Let's look at this more closely...
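For comparison, here is a sketch of what copying the entire array looks like in a small hypothetical class (IntBuffer is invented for this example and is deliberately simpler than Vec; it is one reasonable way to write a deep copy, not the Vec solution).

```cpp
// Hypothetical class that owns a dynamically allocated array (illustration only).
class IntBuffer {
public:
  IntBuffer(unsigned int n, int val) : data_(new int[n]), size_(n) {
    for (unsigned int i = 0; i < size_; ++i) data_[i] = val;
  }
  // Deep copy: allocate a separate array and copy the values, not the pointer.
  IntBuffer(const IntBuffer& other) : data_(new int[other.size_]), size_(other.size_) {
    for (unsigned int i = 0; i < size_; ++i) data_[i] = other.data_[i];
  }
  // Assignment: skip self-assignment, build the new array, then release the old one.
  IntBuffer& operator=(const IntBuffer& other) {
    if (this != &other) {
      int* fresh = new int[other.size_];
      for (unsigned int i = 0; i < other.size_; ++i) fresh[i] = other.data_[i];
      delete [] data_;
      data_ = fresh;
      size_ = other.size_;
    }
    return *this;
  }
  ~IntBuffer() { delete [] data_; }
  int& operator[](unsigned int i) { return data_[i]; }
  unsigned int size() const { return size_; }
private:
  int* data_;
  unsigned int size_;
};
```

After IntBuffer b(a); the two objects own separate arrays, so changing b[0] leaves a[0] alone and each destructor deletes its own memory exactly once.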
8.9 Exercise
Suppose we used the default version of the assignment operator and copy constructor in our Vec<T> class. What
would be the output of the following program? Assume all of the operations except the copy constructor behave as
they would with a std::vector<double>.
8.12 Copy Constructor
This constructor must dynamically allocate any memory needed for the object being constructed, copy the
contents of the memory of the passed object to this new memory, and set the values of the various member
variables appropriately.
Exercise: In our Vec class, the actual copying is done in a private member function called copy. Write the
private member function copy.
8.16 Exercises
Finish the definition of Vec::push_back.
8.17 Vec Declaration & Implementation (vec.h)
#ifndef Vec_h_
#define Vec_h_
// Simple implementation of the vector class, revised from Koenig and Moo. This
// class is implemented using a dynamically allocated array (of templated type T).
// We ensure that m_size is always <= m_alloc and when a push_back or resize
// call would violate this condition, the data is copied to a larger array.
public:
// TYPEDEFS
typedef unsigned int size_type;
private:
// PRIVATE MEMBER FUNCTIONS
void create();
void create(size_type n, const T& val);
void copy(const Vec<T>& v);
// REPRESENTATION
T* m_data; // Pointer to first location in the allocated array
size_type m_size; // Number of elements stored in the vector
size_type m_alloc; // Number of array locations allocated, m_size <= m_alloc
};
// Create a vector with size n, each location having the given value
template <class T> void Vec<T>::create(size_type n, const T& val) {
m_data = new T[n];
m_size = m_alloc = n;
for (size_type i = 0; i < m_size; i++) {
m_data[i] = val;
}
}
// Create the vector as a copy of the given vector.
template <class T> void Vec<T>::copy(const Vec<T>& v) {
}
// Add the value at the last location and increment the bound
m_data[m_size] = val;
++ m_size;
}
// If n is less than or equal to the current size, just change the size. If n is
// greater than the current size, the new slots must be filled in with the given value.
// Re-allocation should occur only if necessary. push_back should not be used.
template <class T> void Vec<T>::resize(size_type n, const T& fill_in_value) {
#endif
baz.h:
#ifndef _baz_h
#define _baz_h
class Baz {
  int a() { return 2; }
};
#endif

foo.h:
#ifndef _foo_h
#define _foo_h
#include "baz.h"
#include "vec.h"
class Foo {
  int b();
};
#endif

bar.h:
#ifndef _bar_h
#define _bar_h
class Bar{
  int c() { return 4; }
};
int d();
#endif

vec.h:
#ifndef _vec_h
#define _vec_h
template <class T>
class Vec {
  int e();
};
template <class T>
int Vec<T>::e() {
  return 6;
}
#endif

lst.h:
#ifndef _lst_h
#define _lst_h
template <class T>
class Lst {
  int f() { return 7; }
  int g();
};
#include "lst.hpp"
#endif

Notes from the diagram:
The preprocessor directives prevent "multiple declarations" of class Baz in main.cpp.
If function d was implemented in bar.h, we would have a link error because the function was "multiply defined" in foo.o & bar.o.
The link step produces myprog.exe/myprog.out.
CSCI-1200 Data Structures Spring 2017
Lecture 9 Iterators & STL Lists
Review from Lecture 8
Designing our own container classes
Dynamically allocated memory in classes
Copy constructors, assignment operators, and destructors
Templated classes, Implementation of the DS Vec class, mimicking the STL vector class
HW3 Tips
You must write the assignment operator,
Matrix::operator=(const Matrix& other_matrix)
When writing copy constructors and assignment operators, if there is dynamic memory involved, you must
copy the values, not the pointers.
Draw memory diagrams! Use small matrices (the SimpleTest() matrices are all small) so that you can draw
out the details. Follow your code line by line.
The homework assignment shows how the matrix data is organized in a double**. Which part(s) are on the
stack and which are on the heap?
If an assertion fails, your code will crash. This is by design. Find the line number of the assertion, and see
what the assert was testing. Read the lines above it too.
Use Dr. Memory or Valgrind to catch leaks and memory errors. Not fixing these can lead to problems all over.
Let's consider quarter() of a 1x1 and of a 0x0 together.
Today
Another vector operation: pop_back
Erasing items from vectors is inefficient!
Iterators and iterator operations
STL lists are a different sequential container class.
Returning references to member variables from member functions
Vec iterator implementation
Optional Reading: Ford & Topp Ch 6; Koenig & Moo, Sections 5.1-5.5
// classlist_ORIGINAL.cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
#include <assert.h>
using namespace std;

void erase_from_vector(unsigned int i, vector<string>& v) {
  /* EXERCISE: IMPLEMENT THIS FUNCTION */
}

// Enroll a student if there is room and the student is not already in course or on waiting list.
void enroll_student(const string& id, unsigned int max_students,
                    vector<string>& enrolled, vector<string>& waiting) {
  // Check to see if the student is already enrolled.
  unsigned int i;
  for (i=0; i < enrolled.size(); ++i) {
    if (enrolled[i] == id) {
      cout << "Student " << id << " is already enrolled." << endl;
      return;
    }
  }
  // If the course isn't full, add the student.
  if (enrolled.size() < max_students) {
    enrolled.push_back(id);
    cout << "Student " << id << " added.\n"
         << enrolled.size() << " students are now in the course." << endl;
    return;
  }
  // Check to see if the student is already on the waiting list.
  for (i=0; i < waiting.size(); ++i) {
    if (waiting[i] == id) {
      cout << "Student " << id << " is already on the waiting list." << endl;
      return;
    }
  }
  // If not, add the student to the waiting list.
  waiting.push_back(id);
  cout << "The course is full. Student " << id << " has been added to the waiting list.\n"
       << waiting.size() << " students are on the waiting list." << endl;
}

// Remove a student from the course or from the waiting list. If removing the student from the
// course opens up a slot, then the first person on the waiting list is placed in the course.
void remove_student(const string& id, unsigned int max_students,
                    vector<string>& enrolled, vector<string>& waiting) {
  // Check to see if the student is on the course list.
  bool found = false;
  unsigned int loc=0;
  while (!found && loc < enrolled.size()) {
    found = enrolled[loc] == id;
    if (!found) ++loc;
  }
  if (found) {
    // Remove the student and see if a student can be taken from the waiting list.
    erase_from_vector(loc, enrolled);
    cout << "Student " << id << " removed from the course." << endl;
    if (waiting.size() > 0) {
      enrolled.push_back(waiting[0]);
      cout << "Student " << waiting[0] << " added to the course from the waiting list." << endl;
      erase_from_vector(0, waiting);
      cout << waiting.size() << " students remain on the waiting list." << endl;
    } else {
      cout << enrolled.size() << " students are now in the course." << endl;
    }
  } else {
    // Check to see if the student is on the waiting list
    found = false;
    loc = 0;
    while (!found && loc < waiting.size()) {
      found = waiting[loc] == id;
      if (!found) ++loc;
    }
    if (found) {
      erase_from_vector(loc, waiting);
      cout << "Student " << id << " removed from the waiting list.\n"
           << waiting.size() << " students remain on the waiting list." << endl;
    } else {
      cout << "Student " << id << " is in neither the course nor the waiting list" << endl;
    }
  }
}

int main() {
  // Read in the maximum number of students in the course
  unsigned int max_students;
  cout << "\nEnrollment program for CSCI 1200\nEnter the maximum number of students allowed\n";
  cin >> max_students;

  // Initialize the vectors
  vector<string> enrolled;
  vector<string> waiting;

  // Invariant:
  // (1) enrolled contains the students already in the course,
  // (2) waiting contains students who will be admitted (in the order of request) if a spot opens up
  // (3) enrolled.size() <= max_students,
  // (4) if the course is not filled (enrolled.size() != max_students) then waiting is empty
  do {
    // check (part of) the invariant
    assert (enrolled.size() <= max_students);
    assert (enrolled.size() == max_students || waiting.size() == 0);
    cout << "\nOptions:\n"
         << " To enroll a student type 0\n"
         << " To remove a student type 1\n"
         << " To end type 2\n"
         << "Type option ==> ";
    int option;
    if (!(cin >> option)) { // if we can't read the input integer, then just fail.
      cout << "Illegal input. Good-bye.\n";
      return 1;
    } else if (option == 2) {
      break; // quit by breaking out of the loop.
    } else if (option != 0 && option != 1) {
      cout << "Invalid option. Try again.\n";
    } else { // option is 0 or 1
      string id;
      cout << "Enter student id: ";
      if (!(cin >> id)) {
        cout << "Illegal input. Good-bye.\n";
        return 1;
      } else if (option == 0) {
        enroll_student(id, max_students, enrolled, waiting);
      } else {
        remove_student(id, max_students, enrolled, waiting);
      }
    }
  }
  while (true);

  // some nice output
  sort(enrolled.begin(), enrolled.end());
  cout << "\nAt the end of the enrollment period, the following students are in the class:\n\n";
  for (unsigned int i=0; i<enrolled.size(); ++i) { cout << enrolled[i] << endl; }
  if (!waiting.empty()) {
    cout << "\nStudents are on the waiting list in the following order:\n";
    for (unsigned int j=0; j<waiting.size(); ++j) { cout << waiting[j] << endl; }
  }
  return 0;
}
9.3 Motivating Example: Course Enrollment and Waiting List
This program maintains the class list and the waiting list for a single course. The program is structured to
handle interactive input. Error checking ensures that the input is valid.
Vectors store the enrolled students and the waiting students. The main work is done in the two functions
enroll_student and remove_student.
The invariant on the loop in the main function determines how these functions must behave.
9.4 Exercises
1. Write erase_from_vector. This function removes the value at index location i from a vector of strings. The
size of the vector should be reduced by one when the function is finished.
// Remove the value at index location i from a vector of strings. The
// size of the vector should be reduced by one when the function is finished.
void erase_from_vector(unsigned int i, vector<string>& v) {
9.6 Iterators
Here's the definition (from Koenig & Moo). An iterator:
identifies a container and a specific element stored in the container,
lets us examine (and change, except for const iterators) the value stored at that element of the container,
provides operations for moving (the iterators) between elements in the container,
restricts the available operations in ways that correspond to what the container can handle efficiently.
As we will see, iterators for different container classes have many operations in common. This often makes the
switch between containers fairly straightforward from the programmer's viewpoint.
Iterators in many ways are generalizations of pointers: many operators / operations defined for pointers are
defined for iterators. You should use this to guide your beginning understanding and use of iterators.
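The parallel between pointers and iterators can be sketched with two hypothetical helper functions (the names sum_with_pointer and sum_with_iterator are ours): the loops have the same shape, and only the type of the loop variable changes.

```cpp
#include <vector>

// Sum an array by stepping a pointer from the first element to one past the last.
int sum_with_pointer(const int* begin, const int* end) {
  int total = 0;
  for (const int* p = begin; p != end; ++p) total += *p;
  return total;
}

// Sum a vector the same way, with an iterator in place of the pointer.
int sum_with_iterator(const std::vector<int>& v) {
  int total = 0;
  for (std::vector<int>::const_iterator itr = v.begin(); itr != v.end(); ++itr) total += *itr;
  return total;
}
```

In both loops the operations used are the same three: != to test for the end, ++ to advance, and * to read the current value.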
The dereference operator is combined with dot operator for accessing the member variables and member
functions of elements stored in containers. Here's an example using the Student class and students vector
from Lecture 4:
vector<Student>::iterator i = students.begin();
(*i).compute_averages(0.45);
Notes:
This operation would be illegal if i had been defined as a const_iterator because compute_averages is
a non-const member function.
The parentheses on the *i are required (because of operator precedence).
There is syntactic sugar for the combination of the dereference operator and the dot operator, which is
exactly equivalent:
vector<Student>::iterator i = students.begin();
i->compute_averages(0.45);
Just like pointers, iterators can be incremented and decremented using the ++ and -- operators to move to the
next or previous element of any container.
Iterators can be compared using the == and != operators.
Iterators can be assigned, just like any other variable.
Vector iterators have several additional operations:
Integer values may be added to them or subtracted from them. This leads to statements like
enrolled.erase(enrolled.begin() + 5);
Vector iterators may be compared using operators like <, <=, etc.
For most containers (other than vectors), these random access iterator operations are not legal and
therefore prevented by the compiler. The reasons will become clear as we look at their implementations.
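Here is a short sketch of these vector-only operations in use. The two function names are hypothetical, and erase_at assumes that i is a valid index.

```cpp
#include <vector>

// Erase the element at index i (assumes i < v.size()) using iterator arithmetic.
void erase_at(std::vector<int>& v, unsigned int i) {
  v.erase(v.begin() + i);   // adding an integer is legal only for random-access iterators
}

// Reverse a vector in place with two iterators that move toward each other.
// The < comparison is another operation that list iterators do not offer.
void reverse_in_place(std::vector<int>& v) {
  if (v.empty()) return;
  std::vector<int>::iterator left = v.begin();
  std::vector<int>::iterator right = v.end() - 1;
  while (left < right) {
    int tmp = *left;
    *left = *right;
    *right = tmp;
    ++left;
    --right;
  }
}
```

Neither function would compile if std::vector were replaced by std::list, because begin() + i, end() - 1, and left < right all require random access.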
Note: the STL vector class has a function that does just this... called erase!
Now, edit the rest of the file to remove all use of the vector subscripting operator.
[Figure: two sequences holding the values 7 5 8 1 9, with index locations 0 through 4 labeled.]
Although the interface (functions called) of lists and vectors and their iterators are quite similar, their
implementations are VERY different. Clues to these differences can be seen in the operations that are NOT in
common, such as:
STL vectors / arrays allow random-access / indexing / [] subscripting. We can immediately jump to
an arbitrary location within the vector / array.
STL lists have no subscripting operation (we can't use [] to access data). The only way to get to the
middle of a list is to follow pointers one link at a time.
Lists have push_front and pop_front functions in addition to the push_back and pop_back functions of
vectors.
erase and insert in the middle of the STL list are very efficient, independent of the size of the list. Both
are implemented by rearranging pointers between the small blocks of memory. (We'll see this when we
discuss the implementation details next week).
We can't use the same STL sort function we used for vector; we must use a special sort function defined
by the STL list type.
std::vector<int> my_vec;
std::list<int> my_lst;
// ... put some data in my_vec & my_lst
std::sort(my_vec.begin(),my_vec.end(),optional_compare_function);
my_lst.sort(optional_compare_function);
Note: STL list sort member function is just as efficient, O(n log n), and will also take the same optional
compare function as STL vector.
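A complete version of the snippet above might look like the following sketch. The comparison function greater_than and the two sort_descending wrappers are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Hypothetical comparison function: orders ints from largest to smallest.
bool greater_than(int a, int b) { return a > b; }

// Vectors are sorted with the generic std::sort algorithm.
void sort_descending(std::vector<int>& v) { std::sort(v.begin(), v.end(), greater_than); }

// Lists are sorted with their own member function.
void sort_descending(std::list<int>& l) { l.sort(greater_than); }
```

The same greater_than function is passed in both cases; only the way sort is invoked differs.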
Several operations invalidate the values of vector iterators, but not list iterators:
erase invalidates all iterators at or after the point of erasure in vectors;
push_back and resize may invalidate ALL iterators in a vector.
The value of any associated vector iterator must be re-assigned / re-initialized after these operations.
9.10 Exercise: Revising the Class List Program to Use Lists (& Iterators)
Now let's further modify the program to use lists instead of vectors. Because we've already switched to iterators,
this change will be relatively easy. And now the program will be more efficient!
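[Figure: the list 7 5 8 1 9 before the erase and 7 8 1 9 after it, with iterators p and q; after the erase, p is marked "?" because it is no longer valid.]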
To reuse the iterator p and make it a valid entry, you will often see the code written:
std::list<int>::iterator p = s.begin();
++p;
p = s.erase(p);
Even though the erase function has the same syntax for vectors and for list, the vector version is O(n), whereas
the list version is O(1).
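The p = s.erase(p) idiom is most useful inside a loop. The following sketch uses a hypothetical function, remove_negatives, to show the usual pattern: advance the iterator only when nothing was erased.

```cpp
#include <list>

// Remove every negative value from the list; returns how many were removed.
int remove_negatives(std::list<int>& s) {
  int removed = 0;
  std::list<int>::iterator p = s.begin();
  while (p != s.end()) {
    if (*p < 0) {
      p = s.erase(p);   // p now refers to the element after the erased one
      ++removed;
    } else {
      ++p;              // only advance when nothing was erased
    }
  }
  return removed;
}
```

Writing s.erase(p); ++p; instead would use p after it has been invalidated.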
9.12 Insert
Similarly, there is an insert function for STL lists that takes an iterator and a value and adds a link in the
chain with the new value immediately before the item pointed to by the iterator.
The call returns an iterator that points to the newly added element. Variants on the basic insert function are
also defined.
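As an illustration of insert, here is a sketch of a hypothetical helper, insert_sorted, that keeps a list of ints in increasing order. It assumes the list is already sorted when it is called.

```cpp
#include <list>

// Insert value into a list that is already sorted in increasing order,
// keeping it sorted. Returns an iterator to the newly added element.
std::list<int>::iterator insert_sorted(std::list<int>& s, int value) {
  std::list<int>::iterator p = s.begin();
  while (p != s.end() && *p < value) ++p;   // stop at the first element >= value
  return s.insert(p, value);                // the new link goes immediately before p
}
```

Note that passing s.end() to insert is legal and appends the value at the back of the list. Finding the position is O(n), but the insert itself is O(1).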
// MODIFIERS
iterator erase(iterator p);
// ITERATOR OPERATIONS
iterator begin() { return m_data; }
const_iterator begin() const { return m_data; }
iterator end() { return m_data + m_size; }
const_iterator end() const { return m_data + m_size; }
First, remember that typedef statements create custom, alternate names for existing types.
Vec<int>::iterator is an iterator type defined by the Vec<int> class. It is just a T * (an int *). Thus,
internal to the declarations and member functions, T* and iterator may be used interchangeably.
Because the underlying implementation of Vec uses an array, and because pointers are the iterators of arrays,
the implementation of vector iterators is quite simple. Note: the implementation of iterators for other STL
containers is more involved!
Thus, begin() returns a pointer to the first slot in the m_data array. And end() returns a pointer to the slot
just beyond the last legal element in the m_data array (as prescribed in the STL standard).
CSCI-1200 Data Structures Spring 2017
Lecture 10 Vector Iterators & Linked Lists
Review from Lecture 9
Explored a program to maintain a class enrollment list and an associated waiting list.
Unfortunately, erasing items from the front or middle of vectors is inefficient.
Iterators can be used to access elements of a vector
Iterators and iterator operations (increment, decrement, erase, & insert)
STL's list class
Differences between indices and iterators, differences between STL list and STL vector.
Today's Class
Quick review of iterators
Implementation of iterators in our homemade Vec class (from Lecture 8)
const and reference on return values
Building our own basic linked lists:
Stepping through a list
Push back
... & even more in the next couple lectures!
Note: We can add an integer to vector and string iterators, but not to list iterators.
The contents of the specific entry referred to by an iterator are accessed using the * dereference operator:
*v_itr = 3.14;
cout << *s_itr << endl;
*l_itr = "Hello";
In the first and third lines, *v_itr and *l_itr are l-values. In the second, *s_itr is an r-value.
Stepping through a container, either forward or backward, is done using increment (++) and decrement (--)
operators:
++itr; itr++; --itr; itr--;
These operations move the iterator to the next and previous locations in the vector, list, or string. The
operations do not change the contents of the container!
Finally, we can change the container that a specific iterator is attached to as long as the types match.
Thus, if v and w are both std::vector<double>, then the code:
v_itr = v.begin();
*v_itr = 3.14; // changes 1st entry in v
v_itr = w.begin() + 2;
*v_itr = 2.78; // changes 3rd entry in w
works fine because v_itr is a std::vector<double>::iterator, but if a is a std::vector<std::string>
then
v_itr = a.begin();
is a compile-time error because of a type clash!
10.2 Additional Iterator Operations for Vector (& String) Iterators
Initialization at a random spot in the vector:
v_itr = v.begin() + i;
Jumping around inside the vector through addition and subtraction of location counts:
v_itr = v_itr + 5;
moves v_itr 5 locations further in the vector. These operations are constant time, O(1), for vectors.
These operations are not allowed for list iterators (and most other iterators, for that matter) because of the
way the corresponding containers are built. These operations would be linear time, O(n), for lists, where n is
the number of slots jumped forward/backward. Thus, they are not provided by STL for lists.
Students are often confused by the difference between iterators and indices for vectors. Consider the following
declarations:
std::vector<double> a(10, 2.5);
std::vector<double>::iterator p = a.begin() + 5;
unsigned int i=5;
Iterator p refers to location 5 in vector a. The value stored there is directly accessed through the * operator:
*p = 6.0;
cout << *p << endl;
The above code has changed the contents of vector a. Here's the equivalent code using subscripting:
a[i] = 6.0;
cout << a[i] << endl;
std::list<int>::iterator itr,itr2,itr3;
itr = lst.begin();// itr is pointing at the 100
++itr; // itr is now pointing at 200
*itr += 1; // 200 becomes 201
// itr += 1; // does not compile! can't advance list iterator like this
itr = lst.end(); // itr is pointing "one past the last legal value" of lst
itr--; // itr is now pointing at 500;
itr2 = itr--; // itr is now pointing at 400, itr2 is still pointing at 500
itr3 = --itr; // itr is now pointing at 300, itr3 is also pointing at 300
10.3 STL List: Erase (review) & Insert (skipped last time)
The erase member function (for STL vector and STL list) takes in a single argument, an iterator pointing
at an element in the container. It removes that item, and the function returns an iterator pointing at the
element after the removed item.
Similarly, there is an insert function for STL vector and STL list that takes in 2 arguments, an iterator and
a new element, and adds that element immediately before the item pointed to by the iterator. The function
returns an iterator pointing at the newly added element.
Even though the erase and insert functions have the same syntax for vector and for list, the vector versions
are O(n), whereas the list versions are O(1).
Iterators positioned on an STL vector, at or after the point of an erase operation, are invalidated. Iterators
positioned anywhere on an STL vector may be invalid after an insert (or push_back or resize) operation.
Iterators attached to an STL list are not invalidated after an insert or erase (except iterators attached to
the erased element!) or push_back/push_front.
public:
// TYPEDEFS
typedef T* iterator;
typedef const T* const_iterator;
// MODIFIERS
iterator erase(iterator p);
// ITERATOR OPERATIONS
iterator begin() { return m_data; }
const_iterator begin() const { return m_data; }
iterator end() { return m_data + m_size; }
const_iterator end() const { return m_data + m_size; }
First, remember that typedef statements create custom, alternate names for existing types.
Vec<int>::iterator is an iterator type defined by the Vec<int> class. It is just a T * (an int *). Thus,
internal to the declarations and member functions, T* and iterator may be used interchangeably.
Because the underlying implementation of Vec uses an array, and because pointers are the iterators of arrays,
the implementation of vector iterators is quite simple. Note: the implementation of iterators for other STL
containers is more involved! We'll see how STL list iterators work in a later lecture.
Thus, begin() returns a pointer to the first slot in the m_data array. And end() returns a pointer to the slot
just beyond the last legal element in the m_data array (as prescribed in the STL standard).
Furthermore, dereferencing a Vec<T>::iterator (dereferencing a pointer to type T) correctly returns one of
the objects in the m_data array, an object with type T.
And similarly, the ++, --, <, ==, !=, >=, etc. operators on pointers automatically apply to Vec iterators. We
don't need to write any additional functions for iterators, since we get all of the necessary behavior from the
underlying pointer implementation.
The erase function requires a bit more attention. We've implemented a version of this function in the previous
lecture. The STL standard further specifies that the return value of erase is an iterator pointing to the new
location of the element just after the one that was deleted.
10.6 References and Return Values
A reference is an alias for another variable. For example:
string a = "Tommy";
string b = a; // a new string is created using the string copy constructor
string& c = a; // c is an alias/reference to the string object a
b[1] = 'i';
cout << a << " " << b << " " << c << endl; // outputs: Tommy Timmy Tommy
c[1] = 'a';
cout << a << " " << b << " " << c << endl; // outputs: Tammy Timmy Tammy
The reference variable c refers to the same string as variable a. Therefore, when we change c, we change a.
Exactly the same thing occurs with reference parameters to functions and the return values of functions. Let's
look at the Student class from Lecture 4 again:
class Student {
public:
const string& first_name() const { return first_name_; }
const string& last_name() const { return last_name_; }
private:
string first_name_;
string last_name_;
};
Based on our discussion of references above and looking at the class declaration, what if we wrote the following?
Would the code then be changing the internal contents of the i-th Student object?
string & fname = students[i].first_name();
fname[1] = 'i';
The answer is NO! The Student class member function first_name returns a const reference. The compiler
will complain that the above code is attempting to assign a const reference to a non-const reference variable.
If we instead wrote the following, then the compiler would complain that you are trying to change a const object.
const string & fname = students[i].first_name();
fname[1] = 'i';
Hence in both cases the Student class would be safe from attempts at external modification.
However, the author of the Student class would get into trouble if the member function return type was only
a reference, and not a const reference. Then external users could access and change the internal contents of an
object! This is a bad idea in most cases.
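The difference between the two return types can be sketched with a small hypothetical class (Name, read, and expose are names invented for this example and are not part of the Student class).

```cpp
#include <string>

// Hypothetical class showing both return styles side by side (illustration only).
class Name {
public:
  Name(const std::string& n) : name_(n) {}
  // Safe: callers can read the string but cannot modify it.
  const std::string& read() const { return name_; }
  // Risky: callers receive an alias to the private member and can change it.
  std::string& expose() { return name_; }
private:
  std::string name_;
};
```

Assigning n.expose() to a string& and then writing through that reference changes the private name_ of n. Copying n.read() into an ordinary string and changing the copy does not.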
template <class T>
class Node {
public:
T value;
Node* ptr;
};
int main() {
Node<int>* ll; // ll is a pointer to a (non-existent) Node
ll = new Node<int>; // Create a Node and assign its memory address to ll
ll->value = 6; // This is the same as (*ll).value = 6;
ll->ptr = NULL; // NULL == 0, which indicates a "null" pointer
10.9 Definition: A Linked List
The definition is recursive: A linked list is either:
Empty, or
Contains a node storing a value and a pointer to a linked list.
The first node in the linked list is called the head node and the pointer to this node is called the head pointer.
The pointer's value will be stored in a variable called head.
The head pointer variable is drawn with its own box. It is an individual variable. It is important to have a
separate pointer to the first node, since the first node may change.
The objects (nodes) that have been dynamically allocated and stored in the linked lists are shown as boxes,
with arrows drawn to represent pointers.
Note that this is a conceptual view only. The memory locations could be anywhere, and the actual values
of the memory addresses aren't usually meaningful.
The last node MUST have NULL for its pointer value; you will have all sorts of trouble if you don't ensure
this!
You should make a habit of drawing pictures of linked lists to figure out how to do the operations.
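To make the definition concrete, here is a sketch that steps through a chain of nodes. The Node class repeats the declaration used earlier in this lecture; length and destroy are hypothetical helper functions written for this example.

```cpp
#include <cstddef>

// Node class as declared earlier in these notes.
template <class T>
class Node {
public:
  T value;
  Node* ptr;
};

// Count the nodes by following ptr links until the NULL that ends the chain.
template <class T> unsigned int length(Node<T>* head) {
  unsigned int count = 0;
  // p is only a pointer variable; no new Node is allocated to walk the list.
  for (Node<T>* p = head; p != NULL; p = p->ptr) ++count;
  return count;
}

// Hypothetical helper: release every node in the chain and reset head to NULL.
template <class T> void destroy(Node<T>*& head) {
  while (head != NULL) {
    Node<T>* rest = head->ptr;   // remember the rest of the chain before deleting
    delete head;
    head = rest;
  }
}
```

Both loops rely on the last node's ptr being NULL. If it is left uninitialized, the loop runs off the end of the list.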
10.12 Exercise: Write is_there
template <class T> bool is_there(Node<T>* head, const T& x) {
If the input linked list chain contains n elements, what is the order notation of is_there?
We must step to the end of the linked list, remembering the pointer to the last node.
This is an O(n) operation and is a major drawback to the ordinary linked-list data structure we are
discussing now. We will correct this drawback by creating a slightly more complicated linking structure
in our next lecture.
If the input linked list chain contains n elements, what is the order notation of the implementation of
push_front?
If the input linked list chain contains n elements, what is the order notation of this implementation of
push_back?
10.16 Next time... Can we get better performance out of linked lists? Yes!
CSCI-1200 Data Structures Spring 2017
Lecture 11 Doubly Linked Lists
Review from Lecture 10
Review of iterators, implementation of iterators in our homemade Vec class
const and reference on return values
Building our own basic linked lists: Stepping through a list & push_back
Today's Lecture
STL List w/ iterators vs. homemade linked list with Node objects & pointers
Basic linked list operations, continued: Insert & Remove
Common mistakes
Limitations of singly-linked lists
Doubly-linked lists:
Structure
Insert
Remove
11.1 Basic Mechanisms: Inserting a Node
There are two parts to this: finding the location where the insert must take place, and doing the insert operation.
We will ignore the find for now. We will also write only a code segment to understand the mechanism rather
than writing a complete function.
The insert operation itself requires that we have a pointer to the location before the insert location.
If p is a pointer to this node, and x holds the value to be inserted, then the following code will do the insertion.
Draw a picture to illustrate what is happening.
Node<T> * q = new Node<T>; // create a new node
q -> value = x; // store x in this node
q -> next = p -> next; // make its successor be the current successor of p
p -> next = q; // make p's successor be this new node
Note: This code will not work if you want to insert x in a new node at the front of the linked list. Why not?
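For experimenting with this code segment, it can be wrapped in a function as in the sketch below. The function name insert_after is hypothetical, the Node class is a minimal singly-linked version using the member name next, and the function assumes p is not NULL.

```cpp
#include <cstddef>

// Singly-linked node using the member name next, as in the code segment above.
template <class T>
class Node {
public:
  T value;
  Node* next;
};

// Insert x in a new node immediately after the node p points to.
// Assumes p is not NULL; returns a pointer to the new node.
template <class T> Node<T>* insert_after(Node<T>* p, const T& x) {
  Node<T>* q = new Node<T>;
  q->value = x;
  q->next = p->next;   // must happen before p->next is overwritten
  p->next = q;
  return q;
}
```

Swapping the last two assignments would lose the rest of the list, which is why the order of the pointer manipulations matters.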
11.6 Basic Linked Lists Mechanisms: Common Mistakes
Here is a summary of common mistakes. Read these carefully, and read them again when you have a problem that
you need to solve.
Allocating a new node to step through the linked list; only a pointer variable is needed.
Confusing the . and the -> operators.
Not setting the pointer from the last node to NULL.
Not considering special cases of inserting / removing at the beginning or the end of the linked list.
Applying the delete operator to a node (calling the operator on a pointer to the node) before it is appropriately
disconnected from the list. Delete should be done after all pointer manipulations are completed.
Pointer manipulations that are out of order. These can ruin the structure of the linked list.
Trying to use STL iterators to visit elements of a home made linked list chain of nodes. (And the reverse....
trying to use ->next and ->prev with STL list iterators.)
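Several of these points show up in even a very small removal function. The sketch below uses a hypothetical helper, remove_after, on a minimal singly-linked Node class; it assumes p is not NULL and does not handle removing the first node.

```cpp
#include <cstddef>

// Minimal singly-linked node for this example.
template <class T>
class Node {
public:
  T value;
  Node* next;
};

// Remove the node that follows p. Assumes p is not NULL.
// Returns true if a node was removed, false if p was the last node.
template <class T> bool remove_after(Node<T>* p) {
  Node<T>* doomed = p->next;      // a plain pointer; no new Node is allocated
  if (doomed == NULL) return false;
  p->next = doomed->next;         // first disconnect the node from the chain
  delete doomed;                  // only then release its memory
  return true;
}
```

Calling delete before reading doomed->next would mean reading memory that has already been freed.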
11.7 Limitations of Singly-Linked Lists
We can only move through it in one direction
We need a pointer to the node before the node that needs to be deleted.
Appending a value at the end requires that we step through the entire list to reach the end.
First we'll reimplement some of the basic mechanisms we've already worked through for singly-linked lists. In
the next lecture we'll build the full dslist class and will define the list iterators as a class inside a class.
Note that we now assume that we have both a head pointer, as before and a tail pointer variable, which stores
the address of the last node in the linked list.
The tail pointer is not strictly necessary, but it allows immediate access to the end of the list for efficient
push-back operations.
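A sketch of how the tail pointer makes push_back a constant-time operation is shown below. The node layout (members next and prev) and the free function push_back are assumptions made for this example; the full dslist class organizes this differently.

```cpp
#include <cstddef>

// Doubly-linked node; the member names next and prev are assumed here.
template <class T>
class Node {
public:
  T value;
  Node* next;
  Node* prev;
};

// Append x in O(1) time using the tail pointer. head and tail are passed
// by reference because either may change (both do when the list is empty).
template <class T> void push_back(Node<T>*& head, Node<T>*& tail, const T& x) {
  Node<T>* n = new Node<T>;
  n->value = x;
  n->next = NULL;
  n->prev = tail;
  if (tail == NULL) head = n;   // empty list: the new node is also the head
  else tail->next = n;
  tail = n;
}
```

No loop is needed: the function touches only the new node and the old tail, regardless of how long the list is.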
11.11 Inserting in the Middle of a Doubly-Linked List
Suppose we want to insert a new node containing the value 15 following the node containing the value 1. We
have a temporary pointer variable, p, that stores the address of the node containing the value 1. Here's a
picture of the state of affairs:
[Figure: a doubly-linked list with the head, tail, and p pointers labeled.]
At this point, we are ignoring the possibility that the linked list is empty or that p points to the tail node (p
pointing to the head node doesn't cause any problems).
Exercise: write the code as just described.
CSCI-1200 Data Structures Spring 2017
Lecture 12 List Implementation
Exam 2 will be Monday evening March 6th from 6-8pm. Practice problems are available on the calendar.
Your exam room & zone assignment will be posted on the homework submission site by the end of the week.
Note: We are re-shuffling the room & zone assignments from Exam 1.
Review from Lecture 11
Limitations of singly-linked lists
Doubly-linked lists: Structure, Insert, & Remove
Note: We didn't finish all of the special/corner cases for remove
from a doubly-linked list. Does it matter? Story time....
Today's Lecture
Our own version of the STL list<T> class, named dslist
Implementing list iterators
[Figure: a dslist<float> object (head_, tail_, size_ = 3) pointing into a chain of Node<float> objects, and a list_iterator<float> object whose ptr_ points to one of the nodes.]
For each list object created by a program, we have one instance of the dslist class, and multiple instances of
the Node. For each iterator variable (of type dslist<T>::iterator) that is used in the program, we create an
instance of the list_iterator class.
operator== and operator!= are defined, but no other comparison operators are allowed.
Don't worry, we'll never test you on where this keyword is needed. Just be prepared to use it when working
on the homework.
12.8 Exercises
1. Write dslist<T>::push_front
2. Write dslist<T>::erase
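A sketch of how these two exercises might be approached. To stay self-contained it uses a hypothetical stripped-down class, mini_list, with public members and raw Node pointers in place of iterators, so it is not the dslist interface itself. The same pointer manipulations would carry over to dslist<T>::push_front and dslist<T>::erase.

```cpp
#include <cassert>
#include <cstddef>

template <class T>
class Node {
public:
  Node(const T& v) : value_(v), next_(NULL), prev_(NULL) {}
  T value_;
  Node<T>* next_;
  Node<T>* prev_;
};

// Hypothetical stripped-down list: only the representation and two operations.
template <class T>
class mini_list {
public:
  mini_list() : head_(NULL), tail_(NULL), size_(0) {}
  ~mini_list() { while (head_ != NULL) erase(head_); }

  void push_front(const T& v) {
    Node<T>* n = new Node<T>(v);
    n->next_ = head_;
    if (head_ != NULL) head_->prev_ = n;  // non-empty: old head points back
    else tail_ = n;                        // empty: new node is also the tail
    head_ = n;
    ++size_;
  }

  // Remove node p; return the node that followed it (NULL if p was the tail).
  Node<T>* erase(Node<T>* p) {
    assert(p != NULL);
    Node<T>* after = p->next_;
    if (p->prev_ != NULL) p->prev_->next_ = after;  // middle or tail
    else head_ = after;                             // p was the head
    if (after != NULL) after->prev_ = p->prev_;     // middle or head
    else tail_ = p->prev_;                          // p was the tail
    delete p;  // only after every pointer has been rewired
    --size_;
    return after;
  }

  // REPRESENTATION (public here only to keep the sketch short)
  Node<T>* head_;
  Node<T>* tail_;
  unsigned int size_;
};
```

Note that erase handles all four cases (only node, head, tail, middle) with the same two if/else tests, and that delete is the last pointer-related step.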
dslist.h
#ifndef dslist_h_
#define dslist_h_
// A simplified implementation of a generic list container class,
// including the iterator, but not the const_iterators. Three
// separate classes are defined: a Node class, an iterator class, and
// the actual list class. The underlying list is doubly-linked, but
// there is no dummy head node and the list is not circular.
#include <cassert>

// -----------------------------------------------------------------
// NODE CLASS
template <class T>
class Node {
public:
  Node() : next_(NULL), prev_(NULL) {}
  Node(const T& v) : value_(v), next_(NULL), prev_(NULL) {}

  // REPRESENTATION
  T value_;
  Node<T>* next_;
  Node<T>* prev_;
};

// A "forward declaration" of this class is needed
template <class T> class dslist;

// -----------------------------------------------------------------
// LIST ITERATOR
template <class T>
class list_iterator {
public:
  // default constructor, copy constructor, assignment operator, & destructor
  list_iterator() : ptr_(NULL) {}
  list_iterator(Node<T>* p) : ptr_(p) {}
  list_iterator(const list_iterator<T>& old) : ptr_(old.ptr_) {}
  list_iterator<T>& operator=(const list_iterator<T>& old) {
    ptr_ = old.ptr_; return *this; }
  ~list_iterator() {}

  // dereferencing operator gives access to the value at the pointer
  T& operator*() { return ptr_->value_; }

  // increment & decrement operators
  list_iterator<T>& operator++() { // pre-increment, e.g., ++iter
    ptr_ = ptr_->next_;
    return *this;
  }
  list_iterator<T> operator++(int) { // post-increment, e.g., iter++
    list_iterator<T> temp(*this);
    ptr_ = ptr_->next_;
    return temp;
  }
  list_iterator<T>& operator--() { // pre-decrement, e.g., --iter
    ptr_ = ptr_->prev_;
    return *this;
  }
  list_iterator<T> operator--(int) { // post-decrement, e.g., iter--
    list_iterator<T> temp(*this);
    ptr_ = ptr_->prev_;
    return temp;
  }
  // the dslist class needs access to the private ptr_ member variable
  friend class dslist<T>;

  // Comparison operators are straightforward
  bool operator==(const list_iterator<T>& r) const {
    return ptr_ == r.ptr_; }
  bool operator!=(const list_iterator<T>& r) const {
    return ptr_ != r.ptr_; }

private:
  // REPRESENTATION
  Node<T>* ptr_; // ptr to node in the list
};

// -----------------------------------------------------------------
// LIST CLASS DECLARATION
// Note that it explicitly maintains the size of the list.
template <class T>
class dslist {
public:
  // default constructor, copy constructor, assignment operator, & destructor
  dslist() : head_(NULL), tail_(NULL), size_(0) {}
  dslist(const dslist<T>& old) { this->copy_list(old); }
  dslist& operator= (const dslist<T>& old);
  ~dslist() { this->destroy_list(); }

  // simple accessors & modifiers
  unsigned int size() const { return size_; }
  bool empty() const { return head_ == NULL; }
  void clear() { this->destroy_list(); }

  // read/write access to contents
  const T& front() const { return head_->value_; }
  T& front() { return head_->value_; }
  const T& back() const { return tail_->value_; }
  T& back() { return tail_->value_; }

  // modify the linked list structure
  void push_front(const T& v);
  void pop_front();
  void push_back(const T& v);
  void pop_back();

  typedef list_iterator<T> iterator;
  iterator erase(iterator itr);
  iterator insert(iterator itr, const T& v);
  iterator begin() { return iterator(head_); }
  iterator end() { return iterator(NULL); }

private:
  // private helper functions
  void copy_list(const dslist<T>& old);
  void destroy_list();

  //REPRESENTATION
  Node<T>* head_;
  Node<T>* tail_;
  unsigned int size_;
};
// -----------------------------------------------------------------
// LIST CLASS IMPLEMENTATION
template <class T>
dslist<T>& dslist<T>::operator= (const dslist<T>& old) {
  // check for self-assignment
  if (&old != this) {
    this->destroy_list();
    this->copy_list(old);
  }
  return *this;
}

template <class T>
void dslist<T>::push_back(const T& v) {
  // (body omitted)
}

template <class T>
typename dslist<T>::iterator dslist<T>::erase(iterator itr) {
  // (body omitted)
}

template <class T>
void dslist<T>::copy_list(const dslist<T>& old) {
  // (body omitted)
}
Today's Lecture
Review Recursion vs. Iteration
Binary Search
Rules for writing recursive functions
Advanced Recursion: problems that cannot be easily solved using iteration (for or while loops):
Merge sort
Non-linear maze search
13.1 Review: Iteration vs. Recursion
Every* recursive function can also be written iteratively. Sometimes the rewrite is quite simple and
straightforward. Sometimes it's more work.
Often writing recursive functions is more natural than writing iterative functions, especially for a first draft of
a problem implementation.
You should learn how to recognize whether an implementation is recursive or iterative, and practice rewriting
one version as the other.
Note: The order notation for the number of operations for the recursive and iterative versions of an algorithm
is usually the same. However, in C, C++, Java, and some other languages, iterative functions are generally
faster than their corresponding recursive functions. This is due to the overhead of the function call mechanism.
Compiler optimizations will sometimes (but not always!) reduce the performance hit by automatically
eliminating the recursive function calls. This is called tail call optimization, and it applies when the
recursive call is the last thing the function does.
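As a small illustration of the points above, here is factorial written three ways. The function names are hypothetical. Only the second recursive version makes its recursive call in tail position, which is the form a compiler may be able to turn into a loop.

```cpp
// Recursive: the multiplication happens after the recursive call returns,
// so the recursive call is not a tail call.
unsigned long fact_recursive(unsigned int n) {
  if (n <= 1) return 1;
  return n * fact_recursive(n - 1);
}

// Tail-recursive: the running product is passed along in acc, and the
// recursive call is the very last thing the function does.
unsigned long fact_tail(unsigned int n, unsigned long acc = 1) {
  if (n <= 1) return acc;
  return fact_tail(n - 1, acc * n);
}

// Iterative: the same computation with an explicit loop.
unsigned long fact_iterative(unsigned int n) {
  unsigned long result = 1;
  for (unsigned int i = 2; i <= n; ++i) result *= i;
  return result;
}
```

All three perform O(n) multiplications; they differ only in how much function call overhead they incur.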
Now suppose that you want to find if a particular value x is in the vector somewhere. How can you do this
without looking at every value in the vector?
The solution is a recursive algorithm called binary search, based on the idea of checking the middle item of
the search interval within the vector and then looking either in the lower half or the upper half of the vector,
depending on the result of the comparison.
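A minimal sketch of recursive binary search on a sorted vector. The name binsearch and the (v, low, high, x) argument order follow the exercise below; the exact body used in lecture may differ, so treat the details here as assumptions.

```cpp
#include <vector>

// Search the sorted range v[low..high] (inclusive bounds) for x.
template <class T>
bool binsearch(const std::vector<T>& v, int low, int high, const T& x) {
  if (low > high) return false;         // empty interval
  if (low == high) return v[low] == x;  // one value left to check
  int mid = low + (high - low) / 2;     // low <= mid < high
  if (x <= v[mid])
    return binsearch(v, low, mid, x);   // lower half, including mid
  else
    return binsearch(v, mid + 1, high, x);  // upper half
}

// Driver function: search the whole vector.
template <class T>
bool binsearch(const std::vector<T>& v, const T& x) {
  return binsearch(v, 0, int(v.size()) - 1, x);
}
```

Because mid is always strictly less than high, both recursive calls work on a strictly smaller interval, so the recursion is guaranteed to reach a base case.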
13.3 Exercises
1. What is the order notation of binary search?
3. If we replaced the if-else structure inside the recursive binsearch function (above) with
if ( x < v[mid] )
return binsearch( v, low, mid-1, x );
else
return binsearch( v, mid, high, x );
13.4 Rules for Writing Recursive Functions
Here is an outline of five steps that are useful in writing and debugging recursive functions. Note: You don't have
to do them in exactly this order...
int main() {
std::vector<double> pts(7);
pts[0] = -45.0; pts[1] = 89.0; pts[2] = 34.7; pts[3] = 21.1;
pts[4] = 5.0; pts[5] = -19.0; pts[6] = -100.3;
mergesort(pts);
for (unsigned int i=0; i<pts.size(); ++i)
std::cout << i << ": " << pts[i] << std::endl;
}
// The driver function for mergesort. It defines a scratch std::vector for temporary copies.
template <class T> void mergesort(std::vector<T>& values) {
std::vector<T> scratch(values.size());
mergesort(0, int(values.size()-1), values, scratch);
}
mergesort(low, mid, values, scratch);
mergesort(mid+1, high, values, scratch);
merge(low, mid, high, values, scratch);
}
Can we analyze this algorithm and determine the order notation for the number of operations it will perform?
Count the number of pairwise comparisons that are required.
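To experiment with this question, here is a self-contained sketch of merge sort that counts pairwise comparisons. The names mergesort_count and merge_count, and the extra comparisons parameter, are additions for illustration and are not part of the lecture code.

```cpp
#include <vector>

// Merge the sorted ranges values[low..mid] and values[mid+1..high].
template <class T>
void merge_count(int low, int mid, int high, std::vector<T>& values,
                 std::vector<T>& scratch, int& comparisons) {
  int i = low, j = mid + 1, k = low;
  while (i <= mid && j <= high) {
    ++comparisons;  // one pairwise comparison per value placed here
    if (values[i] < values[j]) { scratch[k] = values[i]; ++i; }
    else                       { scratch[k] = values[j]; ++j; }
    ++k;
  }
  for (; i <= mid; ++i, ++k) scratch[k] = values[i];   // leftover lower half
  for (; j <= high; ++j, ++k) scratch[k] = values[j];  // leftover upper half
  for (int l = low; l <= high; ++l) values[l] = scratch[l];
}

template <class T>
void mergesort_count(int low, int high, std::vector<T>& values,
                     std::vector<T>& scratch, int& comparisons) {
  if (low >= high) return;  // zero or one value: already sorted
  int mid = (low + high) / 2;
  mergesort_count(low, mid, values, scratch, comparisons);
  mergesort_count(mid + 1, high, values, scratch, comparisons);
  merge_count(low, mid, high, values, scratch, comparisons);
}

// Driver: sorts values and returns the number of comparisons performed.
template <class T>
int mergesort_count(std::vector<T>& values) {
  std::vector<T> scratch(values.size());
  int comparisons = 0;
  mergesort_count(0, int(values.size()) - 1, values, scratch, comparisons);
  return comparisons;
}
```

Each level of the recursion performs fewer than n comparisons in total, and there are about log2(n) levels, which is where the O(n log n) bound comes from.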
The usual problem associated with a grid like this is to find words going forward, backward, up, down, or along
a diagonal. Can you find "computer"?
A sketch of the solution is as follows:
The grid of letters is represented as vector<string> grid; Each string represents a row. We can treat
this as a two-dimensional array.
A word to be sought, such as "computer", is read as a string.
A pair of nested for loops searches the grid for occurrences of the first letter in the string. Call such a
location (r, c)
At each such location, the occurrences of the second letter are sought in the 8 locations surrounding (r, c).
At each location where the second letter is found, a search is initiated in the direction indicated. For
example, if the second letter is at (r, c-1), the search for the remaining letters proceeds up the grid.
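The sketch above might be coded roughly as follows. The function names matches_from and find_word are hypothetical. Instead of looking for the second letter separately, this version simply tries all 8 directions from every starting cell, which visits the same candidates.

```cpp
#include <string>
#include <vector>

// Returns true if word appears starting at (r, c) and continuing in the
// straight-line direction (dr, dc), where dr and dc are each -1, 0, or 1.
bool matches_from(const std::vector<std::string>& grid, const std::string& word,
                  int r, int c, int dr, int dc) {
  for (unsigned int i = 0; i < word.size(); ++i) {
    int rr = r + int(i) * dr;
    int cc = c + int(i) * dc;
    if (rr < 0 || rr >= int(grid.size())) return false;      // off the grid
    if (cc < 0 || cc >= int(grid[rr].size())) return false;  // off the row
    if (grid[rr][cc] != word[i]) return false;               // wrong letter
  }
  return true;
}

// Try every starting location and each of the 8 directions.
bool find_word(const std::vector<std::string>& grid, const std::string& word) {
  if (word.empty()) return false;
  for (int r = 0; r < int(grid.size()); ++r)
    for (int c = 0; c < int(grid[r].size()); ++c)
      for (int dr = -1; dr <= 1; ++dr)
        for (int dc = -1; dc <= 1; ++dc)
          if ((dr != 0 || dc != 0) && matches_from(grid, word, r, c, dr, dc))
            return true;
  return false;
}
```

Keeping the bounds checks inside matches_from means the caller never has to worry about walking off the edge of the grid.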
The implementation takes a bit of work, but is not too bad.
The implementation of this is very similar to the implementation described above until after the first letter of
a word is found.
We will look at the code during lecture, and then consider how to write the recursive function.
// Read in the letter grid, the words to search and print the results
int main(int argc, char* argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: " << argv[0] << " grid-file\n";
    return 1;
  }
  std::ifstream istr(argv[1]);
  if (!istr) {
    std::cerr << "Couldn't open " << argv[1] << '\n';
    return 1;
  }
  std::vector<std::string> board;
  std::string word;
  std::vector<loc> path; // The sequence of locations...
  std::string line;
  // Input of grid from a file. Stops when character '-' is reached.
  while ((istr >> line) && line[0] != '-')
    board.push_back(line);
  while (istr >> word) {
    bool found = false;
    std::vector<loc> path; // Path of locations in finding the word
    // Check all grid locations. For any that have the first
    // letter of the word, call the function search_from_loc
    // to check if the rest of the word is there.
    for (unsigned int r=0; r<board.size() && !found; ++r) {
      for (unsigned int c=0; c<board[r].size() && !found; ++c) {
        if (board[r][c] == word[0] &&
            search_from_loc(loc(r,c), board, word, path))
          found = true;
      }
    }
    // Output results
    std::cout << "\n** " << word << " ** ";
    if (found) {
      std::cout << "was found. The path is \n";
      for(unsigned int i=0; i<path.size(); ++i)
        std::cout << " " << word[i] << ": (" << path[i].row << "," << path[i].col << ")\n";
    } else {
      std::cout << " was not found\n";
    }
  }
  return 0;
}
The base case occurs when the path is full or all positions around the current position have been tried.
Final Note
We've said that recursion is sometimes the most natural way to begin thinking about designing and implementing
many algorithms. It's ok if this feels downright uncomfortable right now. Practice, practice, practice!
CSCI-1200 Data Structures Spring 2017
Lecture 14 Problem Solving Techniques
Review from Lecture 13
Rules for writing recursive functions:
1. Handle the base case(s).
2. Define the problem solution in terms of smaller instances of the problem. Use wishful thinking, i.e., if
someone else solves the problem of fact(4) I can extend that solution to solve fact(5). This defines the
necessary recursive calls. It is also the hardest part!
3. Figure out what work needs to be done before making the recursive call(s).
4. Figure out what work needs to be done after the recursive call(s) complete(s) to finish the computation.
(What are you going to do with the result of the recursive call?)
5. Assume the recursive calls work correctly, but make sure they are progressing toward the base case(s)!
Merge sort
Non-linear maze search
Today's Class
Today we will discuss how to design and implement algorithms using three steps or stages:
1. Generating and Evaluating Ideas
2. Mapping Ideas into Code
3. Getting the Details Right
Do you have the bounds on the loops correct? Should you end at n, n-1, or n-2?
Tidy up your notes to formalize the invariants. Study the code to make sure that your code does in fact have
it right. When possible use assertions to test your invariants. (Remember, sometimes checking the invariant is
impossible or too costly to be practical.)
Does it work on the corner cases; e.g., when the answer is on the start or end of the data, when there are
repeated values in the data, or when the data set is very small or very large?
int main() {
std::cout << "Enter a number: ";
int n;
std::cin >> n;
Given a sequence of n floating point numbers, find the two that are closest in value.
int main() {
float f;
while (std::cin >> f) {
int main() {
int x;
while (std::cin >> x) {
14.5 Example: Merge Sort
In Lecture 13, we saw the basic framework for the merge sort algorithm and we finished the implementation of
the merge helper function. How did we Map Ideas Into Code?
What invariants can we write down within the merge sort and merge functions? Which invariants can we test
using assertions? Which ones are too expensive (i.e., will affect the overall performance of the algorithm)?
// look at the top values, grab the smaller one, store it in the scratch vector
if (values[i] < values[j]) {
scratch[k] = values[i]; ++i;
} else {
scratch[k] = values[j]; ++j;
}
++k;
}
14.6 Example: Nonlinear Word Search
What did we need to think about to Get the Details Right when we finished the implementation of the
nonlinear word search program? What did we worry about when writing the first draft code (a.k.a. pseudo-
code)? When debugging, what test cases should we be sure to try? Let's try to break the code and write down
all the corner cases we need to test.
bool search_from_loc(loc position, const vector<string>& board, const string& word, vector<loc>& path) {
// We have failed to find a path from this loc, remove it from the path
path.pop_back();
return false;
}
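Only the signature and the failure case of search_from_loc appear above. The following is one possible complete version, offered as a sketch. The loc class, the on_path helper, and the rule that a grid location may not be reused within one word are all assumptions for illustration. It assumes, as in the main function from Lecture 13, that the caller has already checked that the letter at position matches the next letter of the word.

```cpp
#include <string>
#include <vector>

// Hypothetical location class with the row and col members used by main.
class loc {
public:
  loc(int r = 0, int c = 0) : row(r), col(c) {}
  int row, col;
};

// Hypothetical helper: is this location already part of the path?
bool on_path(const loc& position, const std::vector<loc>& path) {
  for (unsigned int i = 0; i < path.size(); ++i)
    if (path[i].row == position.row && path[i].col == position.col) return true;
  return false;
}

bool search_from_loc(loc position, const std::vector<std::string>& board,
                     const std::string& word, std::vector<loc>& path) {
  path.push_back(position);
  if (path.size() == word.size()) return true;  // base case: the path is full
  // Try each of the (up to) 8 neighbors for the next letter of the word.
  for (int r = position.row - 1; r <= position.row + 1; ++r) {
    for (int c = position.col - 1; c <= position.col + 1; ++c) {
      if (r < 0 || r >= int(board.size())) continue;
      if (c < 0 || c >= int(board[r].size())) continue;
      if (on_path(loc(r, c), path)) continue;  // also skips position itself
      if (board[r][c] != word[path.size()]) continue;
      if (search_from_loc(loc(r, c), board, word, path)) return true;
    }
  }
  // We have failed to find a path from this loc, remove it from the path
  path.pop_back();
  return false;
}
```

Corner cases worth testing with a version like this include a one-letter word, a word longer than the number of grid cells, and a word that requires backtracking after a promising start.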
int main() {
std::vector<int> v;
int x;
while (std::cin >> x) {
v.push_back(x);
}
14.8 Problem Solving Strategies
Here is an outline of the major steps to use in solving programming problems:
1. Before getting started: study the requirements, carefully!
2. Get started:
(a) What major operations are needed and how do they relate to each other as the program flows?
(b) What important data / information must be represented? How should it be represented? Consider and
analyze several alternatives, thinking about the most important operations as you do so.
(c) Develop a rough sketch of the solution, and write it down. There are advantages to working on paper
first. Don't start hacking right away!
3. Review: reread the requirements and examine your design. Are there major pitfalls in your design? Does
everything make sense? Revise as needed.
4. Details, level 1:
(a) What major classes are needed to represent the data / information? What standard library classes can
be used entirely or in part? Evaluate these based on efficiency, flexibility and ease of programming.
(b) Draft the main program, defining variables and writing function prototypes as needed.
(c) Draft the class interfaces: the member function prototypes.
These last two steps can be interchanged, depending on whether you feel the classes or the main program flow
is the more crucial consideration.
5. Review: reread the requirements and examine your design. Does everything make sense? Revise as needed.
6. Details, level 2:
(a) Write the details of the classes, including member functions.
(b) Write the functions called by the main program. Revise the main program as needed.
7. Review: reread the requirements and examine your design. Does everything make sense? Revise as needed.
8. Testing:
(a) Test your classes and member functions. Do this separately from the rest of your program, if practical.
Try to test member functions as you write them.
(b) Test your major program functions. Write separate driver programs for the functions if possible. Use
the debugger and well-placed output statements and output functions (to print entire classes or data
structures, for example).
(c) Be sure to test on small examples and boundary conditions.
The goal of testing is to incrementally figure out what works: line-by-line, class-by-class, function-by-function.
When you have incrementally tested everything (and fixed mistakes), the program will work.
Notes
For larger programs and programs requiring sophisticated classes / functions, these steps may need to be
repeated several times over.
Depending on the problem, some of these steps may be more important than others.
For some problems, the data / information representation may be complicated and require you to write
several different classes. Once the construction of these classes is working properly, accessing information
in the classes may be (relatively) trivial.
For other problems, the data / information representation may be straightforward, but what's computed
using them may be fairly complicated.
Many problems require combinations of both.
14.9 Design Example: Conway's Game of Life
Let's design a program to simulate Conway's Game of Life. Initially, due to time constraints, we will focus on the
main data structures needed to solve the problem.
Here is an overview of the Game:
We have an infinite two-dimensional grid of cells, which can grow arbitrarily large in any direction.
We will simulate the life & death of cells on the grid through a sequence of generations.
In each generation, each cell is either alive or dead.
At the start of a generation, a cell that was dead in the previous generation becomes alive if it had exactly 3
live cells among its 8 possible neighbors in the previous generation.
At the start of a generation, a cell that was alive in the previous generation remains alive if and only if it had
either 2 or 3 live cells among its 8 possible neighbors in the previous generation.
With fewer than 2 neighbors, it dies of loneliness.
With more than 3 neighbors, it dies of overcrowding.
Important note: all births & deaths occur simultaneously in all cells at the start of a generation.
Other birth / death rules are possible, but these have proven to be a very interesting balance.
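One possible way to code a single generation of these rules, as a sketch. It represents the grid as a std::set of the live cells only, which is just one of several reasonable representations. The typedef Cell and the function next_generation are hypothetical names.

```cpp
#include <map>
#include <set>
#include <utility>

typedef std::pair<int, int> Cell;  // (row, col) of a live cell

// Compute the next generation from the set of currently live cells.
std::set<Cell> next_generation(const std::set<Cell>& live) {
  // For every cell adjacent to a live cell, count its live neighbors.
  std::map<Cell, int> neighbor_count;
  for (std::set<Cell>::const_iterator it = live.begin(); it != live.end(); ++it)
    for (int dr = -1; dr <= 1; ++dr)
      for (int dc = -1; dc <= 1; ++dc)
        if (dr != 0 || dc != 0)
          ++neighbor_count[Cell(it->first + dr, it->second + dc)];

  // Apply the rules using only counts from the previous generation, so all
  // births and deaths happen simultaneously.
  std::set<Cell> next;
  for (std::map<Cell, int>::const_iterator it = neighbor_count.begin();
       it != neighbor_count.end(); ++it) {
    bool was_alive = live.count(it->first) > 0;
    if (it->second == 3 || (was_alive && it->second == 2))
      next.insert(it->first);
  }
  return next;
}
```

Storing only live cells lets the grid grow in any direction without resizing, and a live cell with no live neighbors never appears in neighbor_count, so it dies automatically.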
Many online resources are available with simulation applets, patterns, and history. For example:
http://www.math.com/students/wonders/life/life.html
http://www.radicaleye.com/lifepage/patterns/contents.html
http://www.bitstorm.org/gameoflife/
http://en.wikipedia.org/wiki/Conway's_Game_of_Life
Getting Started
What are the important operations?
How do we organize the operations to form the flow of control for the main program?
What data/information do we need to represent?
What will be the main challenges for this implementation?
Details
New Classes? Which STL classes will be useful?
Testing
Test Cases?
CSCI-1200 Data Structures Spring 2017
Lecture 15 Problem Solving Techniques, Continued
Review of Lecture 14
General Problem Solving Techniques:
1. Generating and Evaluating Ideas
2. Mapping Ideas into Code
3. Getting the Details Right
Small exercises to practice these techniques
Problem Solving Strategies / Checksheet
Today!
More on Complexity
Problem Solving Example: Quicksort (& compare to Mergesort)
Design Example: Conway's Game of Life
int ret = 0;
//Make s calls
for(int i=0; i<s; i++){
ret += func(s,layer-1);
}
return ret;
}
func(1,8); => 1
func(2,8); => 256
func(3,8); => 6561
func(4,8); => 65536
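The function header and base case are not shown above. A completion that is consistent with the four listed results is sketched below; the base case (return 1 when layer reaches 0) is an assumption inferred from those results.

```cpp
// Returns s raised to the power layer by making s recursive calls per level.
int func(int s, int layer) {
  if (layer == 0) return 1;  // assumed base case
  int ret = 0;
  // Make s calls
  for (int i = 0; i < s; i++) {
    ret += func(s, layer - 1);
  }
  return ret;
}
```

With this base case the return value equals the number of base-case calls, s^layer, so the running time is exponential in layer: O(s^layer).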
15.1 Example: Quicksort
Quicksort (also called the partition-exchange sort) is another efficient sorting algorithm. Like mergesort, it is a divide
and conquer algorithm.
The steps are:
1. Pick an element, called a pivot, from the array.
2. Reorder the array so that all elements with values less than the pivot come before the pivot, while all
elements with values greater than the pivot come after it (equal values can go either way). After this
partitioning, the pivot is in its final position. This is called the partition operation.
3. Recursively apply the above steps to the sub-array of elements with smaller values and separately to the
sub-array of elements with greater values.
// Choose a "pivot" and rearrange the vector. Returns the location of the
// pivot, separating top & bottom (hopefully it's near the halfway point).
int partition(vector<double>& data, int start, int end, int& swaps) {
int mid = (start + end)/2;
double pivot = data[mid];
}
}
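The body of the partition function is not shown above. Here is one way the three steps could be completed, as a sketch. It keeps the middle-element pivot choice and the swaps counter from the fragment, but the scanning strategy (move the pivot to the end, then sweep once from the left) and the names qs_partition and quicksort are assumptions and may differ from the version shown in lecture.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Partition data[start..end] around the middle value. Returns the final
// location of the pivot; smaller values end up before it, others after it.
int qs_partition(std::vector<double>& data, int start, int end, int& swaps) {
  int mid = (start + end) / 2;
  double pivot = data[mid];
  std::swap(data[mid], data[end]); ++swaps;  // park the pivot at the end
  int store = start;  // invariant: data[start..store-1] < pivot
  for (int i = start; i < end; ++i) {
    if (data[i] < pivot) {
      if (i != store) { std::swap(data[i], data[store]); ++swaps; }
      ++store;
    }
  }
  std::swap(data[store], data[end]); ++swaps;  // pivot into its final position
  return store;
}

void quicksort(std::vector<double>& data, int start, int end, int& swaps) {
  if (start >= end) return;  // zero or one value: already sorted
  int p = qs_partition(data, start, end, swaps);
  quicksort(data, start, p - 1, swaps);  // values smaller than the pivot
  quicksort(data, p + 1, end, swaps);    // values greater than or equal to it
}
```

Unlike merge sort, all of the work happens before the recursive calls, and no scratch vector is needed.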
What value should you choose as the pivot? What are our different options?
What is the order notation for the running time of this algorithm?
What is the order notation for the additional memory use of this algorithm?
What is the best case for this algorithm? What is the worst case for this algorithm?
Compare the design of Quicksort and Mergesort. What is the same? What is different?
Applying the Problem Solving Strategies
In class we will brainstorm about how to write a simulation of the Game of Life, focusing on the representation of
the grid and on the actual birth and death processes.
Getting Started
What are the important operations?
How do we organize the operations to form the flow of control for the main program?
What data/information do we need to represent?
What will be the main challenges for this implementation?
Details
New Classes? Which STL classes will be useful?
Testing
Test Cases?
What variables will control the running time & memory use of this program? What is the order notation in
terms of these variables for running time & memory use?
What incremental (baby step) improvements can be made to the naive program? How will the order notation
be improved?
What information do we need to store? What C++ or STL data types might be helpful? What new classes
might we want to implement?
15.5 Getting the Details Right
What are the simplest test cases we can start with (to make sure the control flow is correct)?
What are some specific (simple) corner test cases we should write so we won't be surprised when we move to
bigger test cases?
What are the limitations of our approach? Are there certain test cases we won't handle correctly?
What is the maximum test case that can be handled in a reasonable amount of time? How can we measure
the performance of our algorithm & implementation?
CSCI-1200 Data Structures Spring 2017
Lecture 16 Associative Containers (Maps), Part 1
Review from Lectures 14 & 15
How to design and implement algorithms using three steps or stages:
1. Generating and Evaluating Ideas
2. Mapping Ideas into Code
3. Getting the Details Right
Lots of Examples
In our first two examples above, key type is a string. In the first example, the value type is an int and in
the second it is a std::vector<int>.
Entries in maps are pairs:
First, let's see how some of this works with a program to count the occurrences of each word in a file. We'll look
at more details and more examples later.
16.2 Counting Word Occurrences
Here's a simple and elegant solution to this problem using a map:
#include <iostream>
#include <map>
#include <string>
int main() {
std::string s;
std::map<std::string, int> counters; // store each word and an associated counter
// read the input, keeping track of each word and how often we see it
while (std::cin >> s)
++counters[s];
"spot" 1
16.4 STL Pairs
The mechanics of using std::pairs are relatively straightforward:
std::pair is a templated struct with just two members, called first and second. Reminder: a struct
is basically a wimpy class and in this course you aren't allowed to create new structs. You should use classes
instead.
To work with pairs, you must #include <utility>. Note that the header file for maps (#include <map>)
itself includes utility, so you don't have to include utility explicitly when you use pairs with maps.
Here are simple examples of manipulating pairs:
The function std::make_pair creates a pair object from the given values. It is really just a simplified
constructor, and as the example shows there are other ways of constructing pairs.
Most of the statements in the above code show accessing and changing values in pairs.
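The pair examples themselves are not reproduced here. The following is a reconstruction for illustration only: the variable names p1, p2, and p3, the values, and the wrapper function pair_demo are assumptions, chosen to show constructing pairs, accessing first and second, and two statements that would not compile.

```cpp
#include <string>
#include <utility>

// Builds a few pairs, modifies them, and returns p1 so the result can be checked.
std::pair<int, double> pair_demo() {
  std::pair<int, double> p1(5, 7.5);                   // direct construction
  std::pair<int, double> p2 = std::make_pair(8, 9.5);  // using make_pair
  p1.first = p2.first;   // p1 is now (8, 7.5)
  p2.second = 13.3;      // p2 is now (8, 13.3)

  std::pair<const std::string, double> p3 =
      std::make_pair(std::string("hello"), 3.5);
  p3.second = -1.5;      // fine: only the first member is const

  // p3.first = std::string("illegal");  // (a) error: first is const
  // p1 = p3;                            // (b) error: different pair types

  p1.second += p3.second;  // 7.5 + (-1.5) = 6.0
  return p1;
}
```

Calling pair_demo() yields a pair whose first member is 8 and whose second member is 6.0.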
The two statements at the end are commented out because they cause compiler errors:
In (a), the first entry of p3 is const, which means it can't be changed.
In (b), the two pairs are different types! Make sure you understand this.
The const is needed to ensure that the keys aren't changed! This is crucial because maps are sorted by keys!
But operator[] is actually a function call, so it can do things that aren't so simple too, for example:
++counters[s];
For maps, the [] operator searches the map for the pair containing the key (string) s.
If such a pair containing the key is not there, the operator:
1. creates a pair containing the key and a default initialized value,
2. inserts the pair into the map in the appropriate position, and
3. returns a reference to the value stored in this new pair (the second component of the pair).
This second component may then be changed using operator++.
If a pair containing the key is there, the operator simply returns a reference to the value in that pair.
In this particular example, the result in either case is that the ++ operator increments the value associated with
string s (to 1 if the string wasn't already in a pair in the map).
For the user of the map, operator[] makes the map feel like a vector, except that indexing is based on a
string (or any other key) instead of an int.
Note that the result of using [] is that the key is ALWAYS in the map afterwards.
Each iterator refers to a pair stored in the map. Thus, given map iterator it, it->first is a const string
and it->second is an int. Notice the use of it->, and remember that it->first is just shorthand for (*it).first.
16.7 Exercise
Write code to create a map where the key is an integer and the value is a double. (Yes, an integer key!) Store each
of the following in the map: 100 and its sqrt, 100,000 and its sqrt, 5 and its sqrt, and 505 and its sqrt. Write code
to output the contents of the map. Draw a picture of the map contents. What will the output be?
16.8 Map Find
One of the problems with operator[] is that it always places a key / value pair in the map. Sometimes we
don't want this and instead we just want to check if a key is there.
The find member function of the map class does this for us. For example:
m.find(key);
where m is the map object and key is the search key. It returns a map iterator:
If the key is in one of the pairs stored in the map, find returns an iterator referring to this pair.
If the key is not in one of the pairs stored in the map, find returns m.end().
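A short sketch of the difference between find and operator[]. The function name lookup_count is hypothetical; it returns 0 for a missing key without modifying the map.

```cpp
#include <map>
#include <string>

// Look up the count for key without inserting anything into the map.
int lookup_count(const std::map<std::string, int>& m, const std::string& key) {
  std::map<std::string, int>::const_iterator it = m.find(key);
  if (it == m.end()) return 0;  // key not present; the map is unchanged
  return it->second;
}
```

Because the map is passed by const reference, operator[] could not even be called inside this function: operator[] may insert, so it is not a const member function.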
m.insert(std::make_pair(key, value));
insert returns a pair, but not the pair we might expect. Instead it is a pair of a map iterator and a bool:
The insert function checks to see if the key being inserted is already in the map.
If so, it does not change the value, and returns a (new) pair containing an iterator referring to the existing
pair in the map and the bool value false.
If not, it enters the pair in the map, and returns a (new) pair containing an iterator referring to the newly
added pair in the map and the bool value true.
size_type erase(const key_type& k): erases the pair containing key k, returning either 0 or 1, depending
on whether or not the key was in a pair in the map.
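The return values of insert and erase can be used like this. The function count_word is a hypothetical helper that performs one step of the word count using insert instead of operator[].

```cpp
#include <map>
#include <string>
#include <utility>

// Record one occurrence of word, using insert rather than operator[].
void count_word(std::map<std::string, int>& counters, const std::string& word) {
  std::pair<std::map<std::string, int>::iterator, bool> result =
      counters.insert(std::make_pair(word, 1));
  if (!result.second)        // key was already present: nothing was inserted
    ++result.first->second;  // so bump the count in the existing pair
}
```

After calling count_word twice with the same word, its count is 2. Calling counters.erase(word) then returns 1, and calling it a second time returns 0.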
16.11 Exercise
Re-write the word count program so that it uses find and insert instead of operator[].
CSCI-1200 Data Structures Spring 2017
Lecture 17 Associative Containers (Maps), Part 2
Review of Lecture 16
Maps are associations between keys and values.
Maps have fast insert, access and remove operations: O(log n), we'll learn why next week when we study the
implementation!
How do these approaches compare? Which is cleanest, easiest, and most efficient, etc.?
Maps whose keys are class objects, example: maintaining student records.
Lists vs. Graphs vs. Trees
Intro to Binary Trees, Binary Search Trees, & Balanced Trees
Note that the space between the > > is required (by many compiler parsers).
Otherwise, >> is treated as an operator.
Here's the syntax for entering the number 5
in the vector associated with the string "hello":
m[string("hello")].push_back(5);
Here's the syntax for accessing the size of the vector stored
in the map pair referred to by map iterator p:
p = m.find(string("hello"));
p->second.size()
Now, if you want to access (and change) the ith entry in this vector you can either use subscripting:
(p->second)[i] = 15;
(the parentheses are not strictly required, since -> and [] group left to right, but they make the intent explicit) or you can use vector iterators:
vector<int>::iterator q = p->second.begin() + i;
*q = 15;
Both of these, of course, assume that at least i+1 integers have been stored in the vector (either through the
use of push_back or through construction of the vector).
We can figure out the correct syntax for all of these by drawing pictures to visualize the contents of the map
and the pairs stored in the map. We will do this during lecture, and you should do so all the time in practice.
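A small sketch that combines find with subscripting into the stored vector. The typedef map_vect and the function set_entry are hypothetical names. Unlike the fragments above, it checks that the key exists and that i is in range before writing.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > map_vect;

// Set the ith entry of the vector stored under key. Returns false (and
// changes nothing) if the key is missing or i is out of range.
bool set_entry(map_vect& m, const std::string& key, unsigned int i, int value) {
  map_vect::iterator p = m.find(key);
  if (p == m.end() || i >= p->second.size()) return false;
  (p->second)[i] = value;
  return true;
}
```

Using find here, rather than m[key], avoids creating an empty vector as a side effect when the key is not in the map.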
17.2 Exercise
Write code to count the odd numbers stored in the map
This will require testing all contents of each vector in the map. Try writing the code using subscripting on the vectors
and then again using vector iterators.
int main() {
map<string, vector<int> > words_to_lines;
string line;
int line_number = 0;
// Find if each word is already in the map.
for (vector<string>::iterator p = words.begin(); p!= words.end(); ++p) {
// If not, create a new entry with an empty vector (default) and
// add the line number to the end of the vector
map<string, vector<int> >::iterator map_itr = words_to_lines.find(*p);
if (map_itr == words_to_lines.end())
words_to_lines[*p].push_back(line_number); // could use insert here
// If it is, check the last entry to see if the line number is
// already there. If not, add it to the back of the vector.
else if (map_itr->second.back() != line_number)
map_itr->second.push_back(line_number);
}
}
class Name {
public:
Name(const string& first, const string& last) :
m_first(first), m_last(last) {}
const string& first() const { return m_first; }
const string& last() const { return m_last; }
private:
string m_first;
string m_last;
};
class CourseGrade {
public:
CourseGrade(const string &c_name, const string & grade) : course_name(c_name), final_grade(grade) {}
const string & get_course_name() const { return course_name; }
const string & get_final_grade() const { return final_grade; }
private:
string course_name;
string final_grade;
};
class StudentRecord {
public:
const string& getAddress() const { return address; }
const string& getGradeInCourse(const string &course_name) const; /* implementation omitted */
bool hasCompletedCourse(const string &course_name) const; /* implementation omitted */
float getGPA() const { return GPA; }
/* additional member functions omitted */
private:
string address;
vector<CourseGrade> completed_coursework;
float GPA;
/* etc. */
};
Now if we want to create a map of student names and associated student records, we need to add an operator<
for Name objects. This is simple:
bool operator< (const Name& left, const Name& right) {
return left.last() < right.last() ||
(left.last() == right.last() && left.first() < right.first());
}
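To see this operator< at work, here is a self-contained sketch that uses Name objects as map keys. The class and the operator are copied from above with std:: qualification added; mapping each Name to a float GPA (in place of a full StudentRecord) and the typedef gpa_map are simplifications for illustration.

```cpp
#include <map>
#include <string>

class Name {
public:
  Name(const std::string& first, const std::string& last) :
    m_first(first), m_last(last) {}
  const std::string& first() const { return m_first; }
  const std::string& last() const { return m_last; }
private:
  std::string m_first;
  std::string m_last;
};

// Order by last name, breaking ties with first name.
bool operator< (const Name& left, const Name& right) {
  return left.last() < right.last() ||
    (left.last() == right.last() && left.first() < right.first());
}

// Hypothetical stand-in for map<Name, StudentRecord>.
typedef std::map<Name, float> gpa_map;
```

The map never calls operator==. It treats two keys as the same when neither is less than the other, so operator< is the only comparison that Name needs to provide.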
17.5 Exercises
First, let's draw a picture of this map data structure populated with interesting data:
So what are the advantages of organizing this data using a map in this way? Let's assume there are s students,
c different classes offered at the school, each student takes up to k classes before graduation, and at most p
students take a particular course.
Write a fragment of code to access student X's grade in course Y. What is the order notation of this
operation?
Write a fragment of code to make a list of all students who have taken course Y. What is the order notation
of this operation?
17.6 Typedefs
One of the painful aspects of using maps is the syntax. For example, consider a constant iterator in a map
associating strings and vectors of ints:
map < string, vector<int> > :: const_iterator p;
Typedefs are a syntactic means of shortening this. For example, if you place the line:
typedef map < string, vector<int> > map_vect;
before your main function (and any function prototypes), then anywhere you want the map you can just use
the identifier map_vect:
map_vect :: const_iterator p;
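A small sketch of the typedef in use, assuming the map_vect typedef from above. The helper name count_all_entries is hypothetical and only illustrates how the shortened iterator type reads inside a loop.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > map_vect;

// Hypothetical helper: total number of ints stored across all vectors in the map.
int count_all_entries(const map_vect& m) {
  int total = 0;
  for (map_vect::const_iterator p = m.begin(); p != m.end(); ++p) {
    total += int(p->second.size());  // p->first is the string, p->second the vector
  }
  return total;
}
```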
17.10 Definition: Binary Search Trees
17.12 Exercise
Consider the following values:
1. Draw a binary tree with these values that is NOT a binary search tree.
2. Draw two different binary search trees with these values. Important note: This shows that the binary search
tree structure for a given set of values is not unique!
3. How many exactly balanced binary search trees exist with these numbers? How many exactly balanced
binary trees exist with these numbers?
CSCI-1200 Data Structures Spring 2017
Lecture 18 Trees, Part I
Review from Lecture 17
Maps containing more complicated values. Example: index mapping words to the text line numbers on which
they appear.
Maps whose keys are class objects. Example: maintaining student records.
Summary discussion of when to use maps.
Lists vs. Graphs vs. Trees
Intro to Binary Trees, Binary Search Trees, & Balanced Trees
Today's Lecture
Finish Intro to Binary Trees, Binary Search Trees, & Balanced Trees
STL set container class (like STL map, but without the pairs!)
Implementation of ds_set class using binary search trees
In-order, pre-order, and post-order traversal
Breadth-first and depth-first tree search
18.7 Exercises
1. Write a templated function to find the smallest value stored in a binary search tree whose root node is pointed
to by p.
2. Write a function to count the number of odd numbers stored in a binary tree (not necessarily a binary search
tree) of integers. The function should accept a TreeNode<int> pointer as its sole argument and return an
integer. Hint: think recursively!
18.9 ds_set: Class Overview
There are two auxiliary classes, TreeNode and tree_iterator. All three classes are templated.
The only member variables of the ds_set class are the root and the size (number of tree nodes).
The iterator class is declared internally, and is effectively a wrapper on the TreeNode pointers.
Note that operator* returns a const reference because the keys can't change.
The increment and decrement operators are missing (we'll fill this in next week in lecture!).
The main public member functions just call a private (and often recursive) member function (passing the root
node) that does all of the work.
Because the class stores and manages dynamically allocated memory, a copy constructor, operator=, and
destructor must be provided.
18.10 Exercises
1. Provide the implementation of the member function ds_set<T>::begin. This is essentially the problem of
finding the node in the tree that stores the smallest value.
What is the in-order traversal of this tree? Hint: it is monotonically increasing, which is always true for
an in-order traversal of a binary search tree!
What is the post-order traversal of this tree? Hint, it ends with 4 and the 3rd element printed is 2.
What is the pre-order traversal of this tree? Hint, the last element is the same as the last element of the
in-order traversal (but that is not true in general! why not?)
Now let's write code to print out the elements in a binary tree in each of these three orders. These functions
are easy to write recursively, and the code for the three functions looks amazingly similar. Here's the code for
an in-order traversal to print the contents of a tree:
How would you modify this code to perform pre-order and post-order traversals?
When we hit a leaf we step back out, but only to the last decision point and then proceed to the next leaf.
This search method will quickly investigate leaf nodes, but if it has made an incorrect branch decision early
in the search, it will take a long time to work back to that point and go down the right branch.
In a breadth-first search, the nodes are visited with priority based on their distance from the root, with
nodes closer to the root visited first.
In other words, we visit the nodes by level, first the root (level 0), then all children of the root (level 1),
then all nodes 2 links from the root (level 2), etc.
If there are multiple solution nodes, this search method will find the solution node with the shortest path
to the root node.
However, the breadth-first search method is memory-intensive, because the implementation must store all
nodes at the current level and the worst case number of nodes on each level doubles as we progress down
the tree!
Both depth-first and breadth-first will eventually visit all elements in the tree.
Note: The ordering of elements visited by depth-first and breadth-first is not fully specified.
In-order, pre-order, and post-order are all examples of depth-first tree traversals.
What is a breadth-first traversal of the elements in our sample binary search tree above? (We'll write and
discuss code for breadth-first traversal next lecture!)
// Partial implementation of binary-tree based set class similar to std::set.
// The iterator increment & decrement operations have been omitted.
#ifndef ds_set_h_
#define ds_set_h_
#include <iostream>
#include <utility>
// -------------------------------------------------------------------
// TREE NODE CLASS
template <class T>
class TreeNode {
public:
TreeNode() : left(NULL), right(NULL) {}
TreeNode(const T& init) : value(init), left(NULL), right(NULL) {}
T value;
TreeNode* left;
TreeNode* right;
};
// -------------------------------------------------------------------
// TREE NODE ITERATOR CLASS
template <class T>
class tree_iterator {
public:
tree_iterator() : ptr_(NULL) {}
tree_iterator(TreeNode<T>* p) : ptr_(p) {}
tree_iterator(const tree_iterator& old) : ptr_(old.ptr_) {}
~tree_iterator() {}
tree_iterator& operator=(const tree_iterator& old) { ptr_ = old.ptr_; return *this; }
// operator* gives constant access to the value at the pointer
const T& operator*() const { return ptr_->value; }
// comparison operators are straightforward (const, so they also work on const iterators)
bool operator==(const tree_iterator& r) const { return ptr_ == r.ptr_; }
bool operator!=(const tree_iterator& r) const { return ptr_ != r.ptr_; }
// increment & decrement will be discussed in Lecture 19 and Lab 11
private:
// representation
TreeNode<T>* ptr_;
};
// -------------------------------------------------------------------
// DS SET CLASS
template <class T>
class ds_set {
public:
ds_set() : root_(NULL), size_(0) {}
ds_set(const ds_set<T>& old) : size_(old.size_) {
root_ = this->copy_tree(old.root_); }
~ds_set() { this->destroy_tree(root_); root_ = NULL; }
ds_set& operator=(const ds_set<T>& old) {
if (&old != this) {
this->destroy_tree(root_);
root_ = this->copy_tree(old.root_);
size_ = old.size_;
}
return *this;
}
// FIND, INSERT & ERASE
iterator find(const T& key_value) { return find(key_value, root_); }
std::pair< iterator, bool > insert(T const& key_value) { return insert(key_value, root_); }
int erase(T const& key_value) { return erase(key_value, root_); }
// ITERATORS
iterator begin() const {
// Implemented in Lecture 18
}
iterator end() const { return iterator(NULL); }
private:
// REPRESENTATION
TreeNode<T>* root_;
int size_;
// PRIVATE HELPER FUNCTIONS (copy_tree, destroy_tree, find, insert, erase) omitted here
};
#endif
CSCI-1200 Data Structures Spring 2017
Lecture 19 Trees, Part II
Review from Lecture 18 and Lab 10
Binary Trees, Binary Search Trees, & Balanced Trees
STL set container class (like STL map, but without the pairs!)
Today's Lecture
Warmup / Review: destroy_tree
A very important ds_set operation: insert
In-order, pre-order, and post-order traversal; Breadth-first and depth-first tree search
Finding the in-order successor of a binary tree node, tree iterator increment
19.2 Insert
Move left and right down the tree based on comparing keys. The goal is to find the location to
do an insert that preserves the binary search tree ordering property.
[...] pointer location.
Exercise: Why does this work? Is there always [...]
[Figure: a ds_set<T> object with size 8 whose root pointer refers to a tree of Node<T> objects. The root stores 7; its children store 5 and 20; the node storing 5 has a NULL right pointer; the next level stores 2, 14, and 25.]
IMPORTANT NOTE: Passing pointers by reference ensures that the new node is truly inserted into the tree.
This is subtle but important.
How would you modify this code to perform pre-order and post-order traversals?
19.5 General-Purpose Breadth-First Search/Tree Traversal
Write an algorithm to print the nodes in the tree one tier at a time, that is, in a breadth-first manner.
What is the best/average/worst-case running time of this algorithm? What is the best/average/worst-case
memory usage of this algorithm? Give a specific example tree that illustrates each case.
19.7 B+ Trees
Unlike binary search trees, nodes in B+ trees (and their predecessor, the B tree) have up to b children. Thus
B+ trees are very flat and very wide. This is good when it is very expensive to move from one node to another.
B+ trees are supposed to be associative (i.e. they have key-value pairs), but we will just focus on the keys.
Just like STL map and STL set, these keys and values can be any type, but keys must have an operator<
defined.
We can use all our normal terminology, but we'll also refer to non-leaf nodes as internal nodes.
In a B tree, key-value pairs can show up anywhere in the tree; in a B+ tree, all the key-value pairs are in the
leaves and the internal nodes contain duplicates of some keys.
In either type of tree, all leaves are the same distance from the root.
The keys are always sorted in a B/B+ tree node, and there are up to $b - 1$ of them. They act like $b - 1$ binary
search tree nodes mashed together.
In fact, with the exception of the root, nodes will always have between roughly $b/2$ and $b - 1$ keys (in our
implementation).
If a B+ tree node has $k$ keys $key_0, key_1, key_2, \ldots, key_{k-1}$, it will have $k + 1$ children. The keys in the leftmost
child must be $< key_0$, the next child must have keys such that they are $\ge key_0$ and $< key_1$, and so on up to
the rightmost child, which has only keys $\ge key_{k-1}$.
HW8 will focus on implementing some of the functionality of a B+ tree. It won't be enough to replace a real
B+ tree, but it will be enough to understand how the tree works and construct trees.
[Figure: example B+ tree. One node holds the keys ant and e; the row below holds a, ant, b, c, d, e, f.]
CSCI-1200 Data Structures Spring 2017
Lecture 20 Trees, Part III
Review from Lectures 18 & 19
Overview of the ds_set implementation
begin, find, destroy_tree, insert
In-order, pre-order, and post-order traversal; Breadth-first and depth-first tree search
template <class T>
void breadth_first_print (TreeNode<T> *p) {
if (p != NULL) {
std::list<TreeNode<T>*> current_level;
current_level.push_back(p);
while (current_level.size() != 0) {
std::list<TreeNode<T>*> next_level;
for (std::list<TreeNode<T>*>::iterator itr = current_level.begin();
itr != current_level.end(); itr++) {
std::cout << (*itr)->value;
if ((*itr)->left != NULL) { next_level.push_back((*itr)->left); }
if ((*itr)->right != NULL) { next_level.push_back((*itr)->right); }
}
current_level = next_level;
}
}
}
B+ tree overview
Today's Lecture
Iterators
In what order should a forward iterator visit the data? Draw an abstract table representation of this data
(omits details of TreeNode memory layout).
20.2 Erase
First we need to find the node to remove. Once it is found,
the actual removal is easy if the node has no children or only one child.
Draw picture of each case!
It is harder if there are two children:
Find the node with the greatest value in the left subtree or the
node with the smallest value in the right subtree.
The value in this node may be safely moved into the current node
because of the tree ordering.
Then we recursively apply erase to remove that node, which is
guaranteed to have at most one child.
[Figure: example tree with root mouse and children giraffe and snake, subtrees labeled a-f, h-l, n-r, t-z; a second row of labels reads lion, a-f, h-k, n-r, t-z.]
Exercise: How does the order that nodes are deleted affect the tree structure? Starting with a mostly balanced
tree, give an erase ordering that yields an unbalanced tree.
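The erase steps described above can be sketched as a free function. This is only one possible version under simplifying assumptions: it uses a TreeNode without parent pointers, does not maintain the set's size_ member, and always takes the replacement value from the left subtree. The pointer is passed by reference so that the parent's link is updated when a node is removed.

```cpp
#include <cstddef>

// TreeNode without a parent pointer, as in the Lecture 18 listing.
template <class T>
class TreeNode {
public:
  TreeNode() : left(NULL), right(NULL) {}
  TreeNode(const T& init) : value(init), left(NULL), right(NULL) {}
  T value;
  TreeNode* left;
  TreeNode* right;
};

// Returns 1 if key_value was found and removed, 0 otherwise.
template <class T>
int erase(const T& key_value, TreeNode<T>*& p) {
  if (p == NULL) return 0;  // key is not in the tree
  if (key_value < p->value) return erase(key_value, p->left);
  if (p->value < key_value) return erase(key_value, p->right);
  // p now points to the node to remove.
  if (p->left == NULL || p->right == NULL) {
    // Zero or one child: splice the (possibly NULL) child into p's place.
    TreeNode<T>* doomed = p;
    p = (p->left != NULL) ? p->left : p->right;
    delete doomed;
    return 1;
  }
  // Two children: find the greatest value in the left subtree...
  TreeNode<T>* q = p->left;
  while (q->right != NULL) q = q->right;
  T moved = q->value;
  p->value = moved;
  // ...and erase that value from the left subtree. The node holding it has
  // no right child, so the recursion ends in the easy case above.
  return erase(moved, p->left);
}
```

A version for the Lecture 20 listing would additionally need to repair parent pointers whenever a child is spliced into place.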
What is the best/average/worst-case running time of this algorithm? What is the best/average/worst-case
memory usage of this algorithm? Give a specific example tree that illustrates each case.
20.4 Shortest Paths to Leaf Node
Now let's write a function to instead calculate the shortest path to a NULL child pointer.
What is the running time of this algorithm? Can we do better? Hint: How does a breadth-first vs. depth-first
algorithm for this problem compare?
Unlike the situation with lists and vectors, these predecessors and successors are not necessarily nearby
(either in physical memory or by following a link) in the tree, as examples we draw in class will illustrate.
There are two common solution approaches:
Each node stores a parent pointer. Only the root node has a null parent pointer. [method 1]
Each iterator maintains a stack of pointers representing the path down the tree to the current node.
[method 2]
If we choose the parent pointer method, we'll need to rewrite the insert and erase member functions to
correctly adjust parent pointers.
Although iterator increment looks expensive in the worst case for a single application of operator++, it is fairly
easy to show that iterating through a tree storing n nodes requires O(n) operations overall.
Exercise: [method 1] Write a fragment of code that given a node, finds the in-order successor using parent pointers.
Be sure to draw a picture to help you understand!
Exercise: [method 2] Write a fragment of code that given a tree iterator containing a pointer to the node and a
stack of pointers representing path from root to node, finds the in-order successor (without using parent pointers).
Either version can be extended to complete the implementation of increment/decrement for the ds_set tree iterators.
Exercise: What are the advantages & disadvantages of each method?
// -------------------------------------------------------------------
// TREE NODE CLASS
template <class T> class TreeNode {
public:
TreeNode() : left(NULL), right(NULL), parent(NULL) {}
TreeNode(const T& init) : value(init), left(NULL), right(NULL), parent(NULL) {}
T value;
TreeNode* left;
TreeNode* right;
TreeNode* parent; // to allow implementation of iterator increment & decrement
};
template <class T> class ds_set;
// -------------------------------------------------------------------
// TREE NODE ITERATOR CLASS
template <class T> class tree_iterator {
public:
tree_iterator() : ptr_(NULL), set_(NULL) {}
tree_iterator(TreeNode<T>* p, const ds_set<T> * s) : ptr_(p), set_(s) {}
// operator* gives constant access to the value at the pointer
const T& operator*() const { return ptr_->value; }
// comparison operators are straightforward (const, so they also work on const iterators)
bool operator== (const tree_iterator& rgt) const { return ptr_ == rgt.ptr_; }
bool operator!= (const tree_iterator& rgt) const { return ptr_ != rgt.ptr_; }
// increment & decrement operators
tree_iterator<T> & operator++() {
if (ptr_->right != NULL) { // find the leftmost child of the right node
ptr_ = ptr_->right;
while (ptr_->left != NULL) { ptr_ = ptr_->left; }
} else { // go upwards along right branches... stop after the first left
while (ptr_->parent != NULL && ptr_->parent->right == ptr_) { ptr_ = ptr_->parent; }
ptr_ = ptr_->parent;
}
return *this;
}
tree_iterator<T> operator++(int) { tree_iterator<T> temp(*this); ++(*this); return temp; }
tree_iterator<T> & operator--() {
if (ptr_ == NULL) { // so that it works for end()
assert (set_ != NULL);
ptr_ = set_->root_;
while (ptr_->right != NULL) { ptr_ = ptr_->right; }
} else if (ptr_->left != NULL) { // find the rightmost child of the left node
ptr_ = ptr_->left;
while (ptr_->right != NULL) { ptr_ = ptr_->right; }
} else { // go upwards along left branches... stop after the first right
while (ptr_->parent != NULL && ptr_->parent->left == ptr_) { ptr_ = ptr_->parent; }
ptr_ = ptr_->parent;
}
return *this;
}
tree_iterator<T> operator--(int) { tree_iterator<T> temp(*this); --(*this); return temp; }
private:
// representation
TreeNode<T>* ptr_;
const ds_set<T>* set_;
};
// -------------------------------------------------------------------
// DS_SET CLASS
template <class T> class ds_set {
public:
ds_set() : root_(NULL), size_(0) {}
ds_set(const ds_set<T>& old) : size_(old.size_) { root_ = this->copy_tree(old.root_,NULL); }
~ds_set() { this->destroy_tree(root_); root_ = NULL; }
ds_set& operator=(const ds_set<T>& old) {
if (&old != this) {
this->destroy_tree(root_);
root_ = this->copy_tree(old.root_,NULL);
size_ = old.size_;
}
return *this;
}
typedef tree_iterator<T> iterator;
friend class tree_iterator<T>;
int size() const { return size_; }
bool operator==(const ds_set<T>& old) const { return (old.root_ == this->root_); }
// FIND, INSERT & ERASE
iterator find(const T& key_value) { return find(key_value, root_); }
std::pair< iterator, bool > insert(T const& key_value) { return insert(key_value, root_, NULL); }
int erase(T const& key_value) { return erase(key_value, root_); }
// ITERATORS
iterator begin() const {
if (!root_) return iterator(NULL,this);
TreeNode<T>* p = root_;
while (p->left) p = p->left;
return iterator(p,this);
}
iterator end() const { return iterator(NULL,this); }
private:
// REPRESENTATION
TreeNode<T>* root_;
int size_;
// PRIVATE HELPER FUNCTIONS
TreeNode<T>* copy_tree(TreeNode<T>* old_root, TreeNode<T>* the_parent) {
if (old_root == NULL) return NULL;
TreeNode<T> *answer = new TreeNode<T>();
answer->value = old_root->value;
answer->left = copy_tree(old_root->left,answer);
answer->right = copy_tree(old_root->right,answer);
answer->parent = the_parent;
return answer;
}
void destroy_tree(TreeNode<T>* p) {
if (!p) return;
destroy_tree(p->right);
destroy_tree(p->left);
delete p;
}
iterator find(const T& key_value, TreeNode<T>* p) {
if (!p) return end();
if (p->value > key_value) return find(key_value, p->left);
else if (p->value < key_value) return find(key_value, p->right);
else return iterator(p,this);
}
std::pair<iterator,bool> insert(const T& key_value, TreeNode<T>*& p, TreeNode<T>* the_parent) {
if (!p) {
p = new TreeNode<T>(key_value);
p->parent = the_parent;
this->size_++;
return std::pair<iterator,bool>(iterator(p,this), true);
}
else if (key_value < p->value)
return insert(key_value, p->left, p);
else if (key_value > p->value)
return insert(key_value, p->right, p);
else
return std::pair<iterator,bool>(iterator(p,this), false);
}
int erase(T const& key_value, TreeNode<T>* &p) {
/* Implemented in Lecture 20 */
}
};
CSCI-1200 Data Structures Spring 2017
Lecture 21 Operators & Friends
Announcements: Test 3 Information
Test 3 will be held Monday, April 10th from 6-7:50pm.
Your exam room & zone assignment are posted on the homework submission site.
Note: We have re-shuffled the room & zone assignments from Exams 1 & 2.
No make-ups will be given except for emergency situations, and even then a written excuse from the Dean of
Students or the Office of Student Experience will be required.
Coverage: Lectures 1-21, Labs 1-10, HW 1-8.
Closed-book and closed-notes except for 1 sheet of notes on 8.5x11 inch paper (front & back) that may be
handwritten or printed. Computers, cell-phones, calculators, music players, etc. are not permitted and must
be turned off. All students must bring their Rensselaer photo ID card.
Practice problems from previous exams are available on the course website. Solutions to the problems will be
posted on Sunday evening.
Today's Lecture
Finish last lecture!
Shortest path to leaf, iterators, representing the parent
Operators as non-member functions, as member functions, and as friend functions.
We can also write them as member functions (e.g., operator+). When implemented as a member function, the
expression: z + w is translated into: z.operator+ (w)
This shows that operator+ is a member function of z, since z appears on the left-hand side of the operator.
Observe that the function has only one argument!
There are several important properties of the implementation of an operator as a member function:
It is within the scope of class Complex, so private member variables can be accessed directly.
The member variables of z, whose member function is actually called, are referenced directly by name.
The member variables of w are accessed through the parameter rhs.
The member function is const, which means that z will not (and can not) be changed by the function.
Also, w will not be changed, since the argument is also marked const.
Both operator+ and operator- return Complex objects, so both must call Complex constructors to create these
objects. Calling constructors for Complex objects inside functions, especially member functions that work on
Complex objects, seems somewhat counter-intuitive at first, but it is common practice!
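The properties listed above can be seen in a small sketch of a Complex class with operator+ and operator- written as member functions. The member names real_ and imag_, the accessor imag(), and the constructor signature are assumptions for illustration and may differ from the class used in lecture.

```cpp
class Complex {
public:
  Complex(double x = 0.0, double y = 0.0) : real_(x), imag_(y) {}
  double real() const { return real_; }
  double imag() const { return imag_; }

  // z + w is translated into z.operator+(w): the members of z are used
  // directly by name, and the members of w are reached through rhs.
  Complex operator+(const Complex& rhs) const {
    return Complex(real_ + rhs.real_, imag_ + rhs.imag_);
  }
  Complex operator-(const Complex& rhs) const {
    return Complex(real_ - rhs.real_, imag_ - rhs.imag_);
  }

private:
  double real_;
  double imag_;
};
```

Each operator builds and returns a new Complex object by calling the constructor; neither operand is modified.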
21.5 Assignment Operators
The assignment operator: z1 = z2; becomes a function call: z1.operator=(z2);
And cascaded assignments like: z1 = z2 = z3; are really: z1 = (z2 = z3);
which becomes: z1.operator= (z2.operator= (z3));
Studying these helps to explain how to write the assignment operator, which is usually a member function.
The argument (the right side of the operator) is passed by constant reference. Its values are used to change
the contents of the left side of the operator, which is the object whose member function is called. A reference
to this object is returned, allowing a subsequent call to operator= (z1's operator= in the example above).
The identifier this is reserved as a pointer inside class scope to the object whose member function is called.
Therefore, *this is a reference to this object.
The fact that operator= returns a reference allows us to write code of the form: (z1 = z2).real();
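A minimal sketch of an assignment operator written this way, again assuming hypothetical member names real_ and imag_ for the Complex class:

```cpp
class Complex {
public:
  Complex(double x = 0.0, double y = 0.0) : real_(x), imag_(y) {}
  Complex(const Complex& old) : real_(old.real_), imag_(old.imag_) {}
  double real() const { return real_; }
  double imag() const { return imag_; }

  // The right-hand side arrives by constant reference; its values overwrite
  // the object whose member function was called (the left-hand side).
  Complex& operator=(const Complex& rhs) {
    real_ = rhs.real_;
    imag_ = rhs.imag_;
    return *this;  // a reference to the left-hand side, enabling z1 = z2 = z3
  }

private:
  double real_;
  double imag_;
};
```

For a class this simple the compiler-generated assignment operator would behave the same; writing it out matters for classes that manage dynamically allocated memory, such as ds_set.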
21.6 Exercise
Write an operator+= as a member function of the Complex class. To do so, you must combine what you learned
about operator= and operator+. In particular, the new operator must return a reference, *this.
class Foo {
public:
friend class Bar;
...
};
This allows member functions in class Bar to access all of the private member functions and variables of a Foo
object as though they were public (but not vice versa). Note that Foo is giving friendship (access to its private
contents) rather than Bar claiming it. What could go wrong if we allowed friendships to be claimed?
Alternatively, within the definition of the class, we can designate specific functions to be friends, which
grants these functions access similar to that of a member function. The most common example of this is
operators, and especially stream operators.
If we wanted to make one of these stream operators a regular member function, it would have to be a member
function of the ostream class because this is the first argument (left operand). We cannot make it a member
function of the Complex class. This is why stream operators are never member functions.
Stream operators are either ordinary non-member functions (if the operators can do their work through the
public class interface) or friend functions (if they need non-public access).
The most important rule for clean class design involving operators is to NEVER change the intuitive
meaning of an operator. The whole point of operators is lost if you do. One (bad) example would be
defining the increment operator on a Complex number.
CSCI-1200 Data Structures Spring 2017
Lecture 22 Hash Tables
Review from Lecture 21
Finishing binary search trees & the ds_set class
Operators as non-member functions, as member functions.
Today's Lecture
the single most important data structure known to mankind
Hash Tables, Hash Functions, and Collision Resolution
Performance of: Hash Tables vs. Binary Search Trees
Ideally the function will uniformly distribute the keys throughout the range of legal index values (0 to k-1).
What's a collision?
When the hash function maps multiple (different) keys to the same index.
How do we deal with collisions?
One way to resolve this is by storing a linked list of values at each slot in the array.
We'll review how we solved this problem in Lab 9 with an STL vector then an STL map. Finally, we'll
implement the system with a hash table.
Exercise: What's the memory usage for the vector-based Caller ID system?
What's the expected running time for find, insert, and erase?
Exercise: What's the memory usage for the map-based Caller ID system?
What's the expected running time for find, insert, and erase?
#define PHONEBOOK_SIZE 10
class Node {
public:
  int number;
  string name;
  Node* next;
};
// create the phonebook, initially all numbers are unassigned
Node* phonebook[PHONEBOOK_SIZE];
for (int i = 0; i < PHONEBOOK_SIZE; i++) {
  phonebook[i] = NULL;
}
[Figure: the phonebook array, slots 0-9. Slot 1: 5182764321 dan. Slot 2: 6175551212 fred. Slot 4: 5182761234 alice. Slot 7: 5182761267 carol. Slot 8: 5182765678 bob, followed by 5182764488 erin. The remaining slots are empty.]
// corresponds a phone number to a slot in the array
int hash_function(int number) {
What's the expected running time for find, insert, and erase?
Another example of a dangerous hash function on string keys is to add or multiply the ascii values of each char:
The problem is that different permutations of the same string result in the same hash table location.
This can be improved through multiplications that involve the position and value of the key:
The 2nd method is better, but can be improved further. The theory of good hash functions is quite involved
and beyond the scope of this course.
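The two string hash functions discussed above might look roughly like the following. The function names hash_sum and hash_positional and the multiplier 8 are illustrative assumptions; N is the table size.

```cpp
#include <string>

// Weak: adds the character values, so every permutation of the same
// characters lands in the same table location.
unsigned int hash_sum(const std::string& key, unsigned int N) {
  unsigned int value = 0;
  for (std::string::size_type i = 0; i < key.size(); ++i)
    value += static_cast<unsigned char>(key[i]);
  return value % N;
}

// Better: each step multiplies the running value before adding the next
// character, so the result depends on position as well as value.
unsigned int hash_positional(const std::string& key, unsigned int N) {
  unsigned int value = 0;
  for (std::string::size_type i = 0; i < key.size(); ++i)
    value = value * 8 + static_cast<unsigned char>(key[i]);
  return value % N;
}
```

With N = 101, "abc" and "cba" collide under hash_sum but map to different locations under hash_positional. Unsigned overflow on long keys simply wraps around, which is harmless here.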
Quadratic probing: If i is the hash location then the following sequence of table locations is tested:
(i+1)%N, (i+2*2)%N, (i+3*3)%N, (i+4*4)%N, ...
More generally, the $j$-th probe of the table is $(i + c_1 j + c_2 j^2) \bmod N$, where $c_1$ and $c_2$ are constants.
Secondary hashing: when a collision occurs a second hash function is applied to compute a new table
location. This is repeated until an empty location is found.
For each of these approaches, the find operation follows the same sequence of locations as the insert operation.
The key value is determined to be absent from the table only when an empty location is found.
When using open addressing to resolve collisions, the erase function must mark a location as formerly occupied.
If a location is instead marked empty, find may fail to return elements in the table. Formerly-occupied
locations may (and should) be reused, but only after the find operation has been run to completion.
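The role of the formerly-occupied marker can be sketched with a tiny open-addressing table. For brevity this sketch uses linear probing, (i+j)%N, rather than quadratic probing or secondary hashing, and stores non-negative ints; the class name ProbingTable and all of its members are hypothetical.

```cpp
#include <vector>

// Hypothetical open-addressing table of non-negative ints (table size > 0).
class ProbingTable {
public:
  explicit ProbingTable(int n) : keys_(n, 0), state_(n, EMPTY) {}

  bool find(int key) const { return locate(key) != -1; }

  // Returns false if the key is already present or no slot is available.
  bool insert(int key) {
    if (locate(key) != -1) return false;  // run find to completion first
    int n = int(keys_.size());
    for (int j = 0; j < n; ++j) {
      int slot = (hash(key) + j) % n;
      if (state_[slot] != OCCUPIED) {     // EMPTY or FORMERLY_OCCUPIED may be reused
        keys_[slot] = key;
        state_[slot] = OCCUPIED;
        return true;
      }
    }
    return false;
  }

  bool erase(int key) {
    int slot = locate(key);
    if (slot == -1) return false;
    state_[slot] = FORMERLY_OCCUPIED;     // not EMPTY: later searches must probe past it
    return true;
  }

private:
  enum State { EMPTY, OCCUPIED, FORMERLY_OCCUPIED };

  int hash(int key) const { return key % int(keys_.size()); }

  // Follows the same probe sequence as insert; returns the slot holding key, or -1.
  int locate(int key) const {
    int n = int(keys_.size());
    for (int j = 0; j < n; ++j) {
      int slot = (hash(key) + j) % n;
      if (state_[slot] == EMPTY) return -1;  // only a truly empty slot ends the search
      if (state_[slot] == OCCUPIED && keys_[slot] == key) return slot;
    }
    return -1;
  }

  std::vector<int> keys_;
  std::vector<State> state_;
};
```

If erase marked the slot EMPTY instead, a key that had been inserted past that slot would become unreachable by find.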
22.12 Hash Table in STL?
The Standard Template Library standard and implementation of hash table have been slowly evolving over
many years. Unfortunately, the names hash_set and hash_map were spoiled by developers anticipating the
STL standard, so to avoid breaking or having name clashes with code using these early implementations...
STL's agreed-upon standard for hash tables: unordered_set and unordered_map
Depending on your OS/compiler, you may need to add the -std=c++11 flag to the compile line (or other
configuration tweaks) to access these more recent pieces of STL. (And this will certainly continue to evolve
in future years!) Also, for many types STL has a good default hash function, so you may not always need to
specify both template parameters!
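A short usage sketch of std::unordered_set relying on the default hash function for std::string (requires C++11 or later). The helper name count_distinct is hypothetical.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: counts the distinct strings among the first n entries.
// Only the key type is given, so std::hash<std::string> is used by default.
int count_distinct(const std::string words[], int n) {
  std::unordered_set<std::string> seen;
  for (int i = 0; i < n; ++i) {
    seen.insert(words[i]);  // duplicates are ignored
  }
  return int(seen.size());
}
```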
We use separate chaining for collision resolution. Hence the main data structure inside the class is:
std::vector< std::list<KeyType> > m_table;
We will use automatic resizing when our table is too full. Resize is expensive of course, so similar to the auto-
matic reallocation that occurs inside the vector push_back function, we at least double the size of underlying
structure to ensure it is rarely needed.
Once our new type containing the hash function is defined, we can create instances of our hash set object
containing std::string by specifying the type hash_string_obj as the second template parameter to the
declaration of a ds_hashset. E.g.,
The iterator must store:
A pointer to the hash table it is associated with. This reflects a subtle point about types: even though
the iterator class is declared inside the ds_hashset, this does not mean an iterator automatically knows
about any particular ds_hashset.
The index of the current list in the hash table.
An iterator referencing the current location in the current list.
Because of the way the classes are nested, the iterator class object must declare the ds_hashset class as a
friend, but the reverse is unnecessary.
end(): Also associates the iterator with the specific table, assigns an index of -1 (indicating it is not a normal
valid index), and thus does not assign the particular list iterator.
Exercise: Implement the begin() function.
The decrement operator must check if the iterator in the list is at the beginning and if so it must proceed to
find the previous non-empty list and then find the last entry in that list. This might sound expensive, but
remember that the lists should be very short.
The comparison operators must accommodate the fact that when (at least) one of the iterators is the end, the
internal list iterator will not have a useful value.
If the key is not in the list at the index location, then the key should be inserted in the list (at the front is
fine), and an iterator is created referencing the location of the newly-inserted key; a pair is returned with this
iterator and true.
Exercise: Implement the insert() function, ignoring for now the resize operation.
Find is similar to insert, computing the hash function and index, followed by a std::find operation.
22.19 Erase
Two versions are implemented, one based on a key value and one based on an iterator. These are based on
finding the appropriate iterator location in the appropriate list, and applying the list erase function.
22.20 Resize
Must copy the contents of the current vector into a scratch vector, resize the current vector, and then re-insert
each key into the resized vector. Exercise: Write resize()
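One way to picture the resize steps is the sketch below, written as free functions on a vector of lists of strings rather than as ds_hashset member functions. The typedef Table, the hash function hash_string, and the helper names insert_key and resize_table are all assumptions for illustration.

```cpp
#include <list>
#include <string>
#include <vector>

typedef std::vector< std::list<std::string> > Table;

// Hypothetical hash function, used only for this sketch.
unsigned int hash_string(const std::string& key) {
  unsigned int value = 0;
  for (std::string::size_type i = 0; i < key.size(); ++i)
    value = value * 8 + static_cast<unsigned char>(key[i]);
  return value;
}

// Separate chaining insert (table must be non-empty); returns false if the
// key is already present.
bool insert_key(Table& table, const std::string& key) {
  std::list<std::string>& chain = table[hash_string(key) % table.size()];
  for (std::list<std::string>::iterator p = chain.begin(); p != chain.end(); ++p) {
    if (*p == key) return false;
  }
  chain.push_front(key);
  return true;
}

// Copy the table into a scratch vector, resize, and re-insert every key.
// Keys must be re-hashed because the index depends on the table size.
void resize_table(Table& table, unsigned int new_size) {
  Table old_table = table;
  table.clear();
  table.resize(new_size);
  for (Table::size_type i = 0; i < old_table.size(); ++i) {
    for (std::list<std::string>::iterator p = old_table[i].begin();
         p != old_table[i].end(); ++p) {
      insert_key(table, *p);
    }
  }
}
```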
ds_hashset_lec.h Thu Apr 06 21:41:08 2017 1
#ifndef ds_hashset_h_
#define ds_hashset_h_
// The set class as a hash table instead of a binary search tree. The
// primary external difference between ds_set and ds_hashset is that
// the iterators do not step through the hashset in any meaningful
// order. It is just the order imposed by the hash function.
#include <iostream>
#include <list>
#include <string>
#include <vector>

// The ds_hashset is templated over both the type of key and the type
// of the hash function, a function object.
template < class KeyType, class HashFunc >
class ds_hashset {
private:
  typedef typename std::list<KeyType>::iterator hash_list_itr;

public:
  // =================================================================
  // THE ITERATOR CLASS
  // Defined as a nested class and thus is not separately templated.

  class iterator {
  public:
    friend class ds_hashset; // allows access to private variables
  private:

    // ITERATOR REPRESENTATION
    ds_hashset* m_hs;
    int m_index; // current index in the hash table
    hash_list_itr m_list_itr; // current iterator at the current index

  private:
    // private constructors for use by the ds_hashset only
    iterator(ds_hashset * hs) : m_hs(hs), m_index(-1) {}
    iterator(ds_hashset* hs, int index, hash_list_itr loc)
      : m_hs(hs), m_index(index), m_list_itr(loc) {}

  public:
    // Ordinary constructors & assignment operator
    iterator() : m_hs(0), m_index(-1) {}
    iterator(iterator const& itr)
      : m_hs(itr.m_hs), m_index(itr.m_index), m_list_itr(itr.m_list_itr) {}
    iterator& operator=(const iterator& old) {
      m_hs = old.m_hs;
      m_index = old.m_index;
      m_list_itr = old.m_list_itr;
      return *this;
    }

    // The dereference operator need only worry about the current
    // list iterator, and does not need to check the current index.
    const KeyType& operator*() const { return *m_list_itr; }

    // The comparison operators must account for the list iterators
    // being unassigned at the end.
    friend bool operator== (const iterator& lft, const iterator& rgt)
    { return lft.m_hs == rgt.m_hs && lft.m_index == rgt.m_index &&
        (lft.m_index == -1 || lft.m_list_itr == rgt.m_list_itr); }
    friend bool operator!= (const iterator& lft, const iterator& rgt)
    { return lft.m_hs != rgt.m_hs || lft.m_index != rgt.m_index ||
        (lft.m_index != -1 && lft.m_list_itr != rgt.m_list_itr); }

    // increment and decrement
    iterator& operator++() {
      this->next();
      return *this;
    }
    iterator operator++(int) {
      iterator temp(*this);
      this->next();
      return temp;
    }
    iterator & operator--() {
      this->prev();
      return *this;
    }
    iterator operator--(int) {
      iterator temp(*this);
      this->prev();
      return temp;
    }

  private:
    // Find the next entry in the table
    void next() {
      ++ m_list_itr; // next item in the list

      // If we are at the end of this list
      if (m_list_itr == m_hs->m_table[m_index].end()) {
        // Find the next non-empty list in the table
        for (++m_index;
             m_index < int(m_hs->m_table.size()) && m_hs->m_table[m_index].empty();
             ++m_index) {}

        // If one is found, assign the m_list_itr to the start
        if (m_index != int(m_hs->m_table.size()))
          m_list_itr = m_hs->m_table[m_index].begin();

        // Otherwise, we are at the end
        else
          m_index = -1;
      }
    }

    // Find the previous entry in the table
    void prev() {
      // If we aren't at the start of the current list, just decrement
      // the list iterator
      if (m_list_itr != m_hs->m_table[m_index].begin())
        m_list_itr -- ;

      else {
        // Otherwise, back down the table until the previous
        // non-empty list in the table is found
        for (--m_index; m_index >= 0 && m_hs->m_table[m_index].empty(); --m_index) {}

        // Go to the last entry in the list.
        m_list_itr = m_hs->m_table[m_index].begin();
        hash_list_itr p = m_list_itr; ++p;
        for (; p != m_hs->m_table[m_index].end(); ++p, ++m_list_itr) {}
      }
    }
  };
  // end of ITERATOR CLASS
  // =================================================================
private:
  // =================================================================
  // HASH SET REPRESENTATION
  std::vector< std::list<KeyType> > m_table;  // actual table
  HashFunc m_hash;                            // hash function
  unsigned int m_size;                        // number of keys

public:
  // =================================================================
  // HASH SET IMPLEMENTATION

  // Constructor for the table accepts the size of the table. Default
  // constructor for the hash function object is implicitly used.
  ds_hashset(unsigned int init_size = 10) : m_table(init_size), m_size(0) {}

  // Copy constructor just uses the member function copy constructors.
  ds_hashset(const ds_hashset<KeyType, HashFunc>& old)
    : m_table(old.m_table), m_size(old.m_size) {}

  ~ds_hashset() {}

  // Find the key, using hash function, indexing and list find
  iterator find(const KeyType& key) {
    unsigned int hash_value = m_hash(key);
    unsigned int index = hash_value % m_table.size();
    hash_list_itr p = std::find(m_table[index].begin(),
                                m_table[index].end(), key);
    if (p == m_table[index].end())
      return this->end();
    else
      return iterator(this, index, p);
  }

  // Erase the key
  int erase(const KeyType& key) {
    // Find the key and use the erase iterator function.
    iterator p = find(key);
    if (p == end())
      return 0;
    else {
      erase(p);
      return 1;
    }
  }

  // Erase at the iterator
  void erase(iterator p) {
    m_table[ p.m_index ].erase(p.m_list_itr);
  }

  // Find the first entry in the table and create an associated iterator
  iterator begin() {
    // implemented in lecture or lab
  }

private:
  // resize the table with the same values but a
  void resize_table(unsigned int new_size) {
    // implemented in lecture or lab
  }
};
#endif
CSCI-1200 Data Structures Spring 2017
Lecture 23 Functors & Hash Tables, part II
Review from Lecture 22
Hash Tables, Hash Functions, and Collision Resolution
Performance of: Hash Tables vs. Binary Search Trees
Collision resolution: separate chaining vs open addressing
STL's unordered_set (and unordered_map)
Today's Lecture
Using STL's for_each
Something weird & cool in C++... Function Objects, a.k.a. Functors
Continuing with Hash Tables...
STL's unordered_set (and unordered_map)
Using a hash table to implement a set/map
Hash functions as functors/function objects
Iterators, find, insert, and erase
std::vector<float> my_data;
my_data.push_back(3.14);
my_data.push_back(1.41);
my_data.push_back(6.02);
my_data.push_back(2.71);
Now we can write a loop to print out all the data in our vector:
std::vector<float>::iterator itr;
for (itr = my_data.begin(); itr != my_data.end(); itr++) {
float_print(*itr);
}
Alternatively, we can use it with STL's for_each function to visit and print each element:
Wow! That's a lot less to type. Can I stop using regular for and while loops altogether?
We can actually also do the same thing without creating & explicitly naming the float_print function. We
create an anonymous function using lambda:
Lambda is new to the C++ language (part of C++11). But lambda is a core piece of many classic, older
programming languages including Lisp and Scheme. Python lambdas and Perl anonymous subroutines are
similar. (In fact lambda dates back to the 1930s, before the first computers were built!) You'll learn more
about lambda in later courses like CSCI 4430 Programming Languages!
23.2 Function Objects, a.k.a. Functors
In addition to the basic mathematical operators + - * / < > , another operator we can overload for our C++
classes is the function call operator.
Why do we want to do this? This allows instances or objects of our class to be used like functions. It's weird
but powerful.
Here's the basic syntax. Any specific number of arguments can be used.
class my_class_name {
public:
// ... normal class stuff ...
my_return_type operator() ( /* my list of args */ );
};
Remember how we can sort the my_data vector defined above using our own homemade comparison function
for sorting:
std::sort(my_data.begin(),my_data.end(),float_less);
std::sort(my_data.begin(),my_data.end());
std::sort(my_data.begin(),my_data.end(),std::less<float>());
What is std::less? It's a templated class. Above we have called the default constructor to make an instance
of that class. Then, that instance/object can be used like it's a function. Weird!
How does it do that? std::less is a teeny tiny class that just contains the overloaded function call operator.
You can use this instance/object/functor as a function that expects exactly two arguments of type T (in this
example float) that returns a bool. That's exactly what we need for std::sort! This ultimately does the
same thing as our tiny helper homemade compare function!
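To make the comparison concrete, here is a sketch that sorts with a plain function and with a hand-written functor. The body of float_less is an assumption (it is not shown in this part of the notes), and my_float_less is a hypothetical stand-in that mimics what std::less<float> does.

```cpp
#include <algorithm>
#include <vector>

// Assumed definition of the homemade comparison function.
bool float_less(float a, float b) { return a < b; }

// Hypothetical stand-in for std::less<float>: a class whose only member
// is the overloaded function call operator.
class my_float_less {
public:
  bool operator()(float a, float b) const { return a < b; }
};

// Sorts a copy of the data using the plain function.
std::vector<float> sorted_with_function(std::vector<float> data) {
  std::sort(data.begin(), data.end(), float_less);
  return data;
}

// Sorts a copy of the data using a temporary functor object.
std::vector<float> sorted_with_functor(std::vector<float> data) {
  std::sort(data.begin(), data.end(), my_float_less());
  return data;
}
```

Both calls should produce the same ordering as std::sort with no third argument.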
class between_values {
private:
float low, high;
public:
between_values(float l, float h) : low(l), high(h) {}
bool operator() (float val) { return low <= val && val <= high; }
};
The range between low & high is specified when a functor/an instance of this class is created. We might
have multiple different instances of the between_values functor, each with their own range. Later, when the
functor is used, the query value will be passed in as an argument. The function call operator accepts that
single argument val and compares against the internal data low & high.
This can be used in combination with STL's find_if construct. For example:
between_values two_and_four(2,4);
Alternatively, we could create the functor without giving it a variable name. And in the use below we also
capture the return value to print out the first item in the vector inside this range. Note that it does not print
all values in the range.
std::vector<float>::iterator itr;
itr = std::find_if(my_data.begin(), my_data.end(), between_values(2,4));
if (itr != my_data.end()) {
std::cout << "my_data contains " << *itr
<< ", a value between 2 & 4 (inclusive)!" << std::endl;
}
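Putting the pieces together, a sketch of find_if with the between_values functor. The class is repeated from the notes (with a const call operator added), and first_in_range is a hypothetical helper that reports the first match through a reference parameter.

```cpp
#include <algorithm>
#include <vector>

// Same functor as in the notes; operator() is marked const here.
class between_values {
private:
  float low, high;
public:
  between_values(float l, float h) : low(l), high(h) {}
  bool operator() (float val) const { return low <= val && val <= high; }
};

// Hypothetical helper: returns true and stores the FIRST value inside
// [low, high] in 'found'; returns false if no value is in the range.
bool first_in_range(const std::vector<float>& data, float low, float high,
                    float& found) {
  std::vector<float>::const_iterator itr =
      std::find_if(data.begin(), data.end(), between_values(low, high));
  if (itr == data.end()) return false;
  found = *itr;
  return true;
}
```

With the my_data values from earlier, the first value in the range 2 to 4 is the first element pushed, even though a later element also qualifies.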
Using a home-made std::string hash function. Note: We are required to specify the initial table size.
Manually specifying the hash function type.
std::unordered_map<std::string,Foo,std::function<unsigned int(std::string)> > m(1000, MyHashFunction);
Note: In the above examples we're creating an association between two types (STL strings and custom Foo
objects). If you'd like to just create a set (no associated 2nd type), simply switch from unordered_map to
unordered_set and remove the Foo from the template type in the examples above.
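A sketch of how the declaration above could be used end to end. Foo and the body of MyHashFunction are not shown in this part of the notes, so both are hypothetical here; only the std::unordered_map declaration follows the line in the notes.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical value type standing in for the Foo class from the notes.
struct Foo { int data; };

// Hypothetical homemade hash: multiply-and-add over the characters.
unsigned int MyHashFunction(std::string key) {
  unsigned int hash = 0;
  for (unsigned int i = 0; i < key.size(); ++i)
    hash = hash * 31 + static_cast<unsigned char>(key[i]);
  return hash;
}

typedef std::unordered_map<std::string, Foo,
                           std::function<unsigned int(std::string)> > FooTable;

// Builds a small table. The first constructor argument (1000) is the
// initial number of buckets; the second is the hash function to use.
FooTable make_table() {
  FooTable m(1000, MyHashFunction);
  m["alpha"].data = 1;
  m["beta"].data = 2;
  return m;
}
```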
CSCI-1200 Data Structures Spring 2017
Lecture 24 Priority Queues
Review from Lectures 22 & 23
Hash Tables, Hash Functions, and Collision Resolution
Performance of: Hash Tables vs. Binary Search Trees
Collision resolution: separate chaining vs open addressing
STL's unordered_set (and unordered_map)
Using a hash table to implement a set/map
Hash functions as functors/function objects
Iterators, find, insert, and erase
Using STL's for_each
Something weird & cool in C++... Function Objects, a.k.a. Functors
Today's Lecture
STL Queue and STL Stack
Definition of a Binary Heap
What's a Priority Queue?
A Priority Queue as a Heap
A Heap as a Vector
Building a Heap
Heap Sort
If time allows... Merging heaps is the motivation for leftist heaps
Use an STL stack to print the elements with a pre-order traversal ordering. This is straightforward.
Use an STL stack to print the elements with an in-order traversal ordering. This is more complicated.
Use an STL queue to print the elements with a breadth-first traversal ordering.
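One possible shape for two of these traversals, as a sketch. The Node struct is a hypothetical minimal tree node (the notes' own node class is not shown here), and the functions return the visiting order in a vector instead of printing so the result is easy to check.

```cpp
#include <queue>
#include <stack>
#include <vector>

// Hypothetical minimal binary tree node.
struct Node {
  int value;
  Node* left;
  Node* right;
  Node(int v, Node* l = nullptr, Node* r = nullptr)
    : value(v), left(l), right(r) {}
};

// Breadth-first: a queue hands nodes back level by level, left to right.
std::vector<int> breadth_first(Node* root) {
  std::vector<int> order;
  std::queue<Node*> todo;
  if (root) todo.push(root);
  while (!todo.empty()) {
    Node* n = todo.front();
    todo.pop();
    order.push_back(n->value);
    if (n->left) todo.push(n->left);
    if (n->right) todo.push(n->right);
  }
  return order;
}

// Pre-order: a stack, pushing the right child first so that the left
// child is popped (and visited) first.
std::vector<int> pre_order(Node* root) {
  std::vector<int> order;
  std::stack<Node*> todo;
  if (root) todo.push(root);
  while (!todo.empty()) {
    Node* n = todo.top();
    todo.pop();
    order.push_back(n->value);
    if (n->right) todo.push(n->right);
    if (n->left) todo.push(n->left);
  }
  return order;
}
```

The two functions differ only in the container and the order in which children are pushed. The in-order version needs extra bookkeeping and is left as the exercise intends.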
24.3 What's a Priority Queue?
Priority queues are used in prioritizing operations. Examples include a personal to-do list, what order to do
homework assignments, jobs on a shop floor, packet routing in a network, scheduling in an operating system,
or events in a simulation.
Among the data structures we have studied, their interface is most similar to a queue, including the idea of a
front or top and a tail or a back.
Each item is stored in a priority queue using an associated priority and therefore, the top item is the one
with the lowest value of the priority score. The tail or back is never accessed through the public interface to
a priority queue.
The main operations are insert or push, and pop (or delete_min).
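For comparison, a small sketch using the standard library's std::priority_queue. By default it keeps the largest element on top, so std::greater<int> is supplied to match the convention above, where the top item has the lowest priority value. The function name drain_in_priority_order is hypothetical.

```cpp
#include <functional>
#include <queue>
#include <vector>

// A priority queue whose top is the SMALLEST value (a min-queue).
typedef std::priority_queue<int, std::vector<int>, std::greater<int> > MinQueue;

// Pushes every value, then pops them all; values come out lowest first.
std::vector<int> drain_in_priority_order(const std::vector<int>& values) {
  MinQueue pq;
  for (unsigned int i = 0; i < values.size(); ++i)
    pq.push(values[i]);
  std::vector<int> out;
  while (!pq.empty()) {
    out.push_back(pq.top());  // top() only looks at the item
    pq.pop();                 // pop() removes it (delete_min)
  }
  return out;
}
```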
The latter is the better solution, but we would like to improve upon it; for example, it might be more natural
if the minimum priority value were stored at the root.
We will achieve this with a binary heap, giving up the complete ordering imposed in the binary search tree.
Draw several other trees with these values that are not binary heaps.
24.7 Implementing Pop (a.k.a. Delete Min)
The value at the top (root) of the tree is replaced by the value stored in the last leaf node.
This has echoes of the erase function in binary search trees.
The last leaf node is removed.
QUESTION: But how do we find the last leaf? Ignore this for now...
The value now at the root likely breaks the heap property. We use the percolate_down function to restore
the heap property. This function is written here in terms of tree nodes with child pointers (and the priority
stored as a value), but later it will be written in terms of vector subscripts.
template <class T>
void percolate_down(TreeNode<T> * p) {
  while (p->left) {
    TreeNode<T>* child;
    // Choose the child to compare against
    if (p->right && p->right->value < p->left->value)
      child = p->right;
    else
      child = p->left;
    if (child->value < p->value) {
      // swap the stored value (and any other non-pointer member vars),
      // leaving the tree's pointers untouched
      std::swap(child->value, p->value);
      p = child;
    }
    else
      break;
  }
}
But, percolate_up (and as a result push) is O(1) in the average case. Why?
24.11 Implementing a Heap with a Vector (instead of Nodes & Pointers)
In the vector implementation, the tree is never explicitly constructed. Instead the heap is stored as a vector,
and the child and parent pointers can be implicitly calculated.
To do this, number the nodes in the tree starting with 0 first by level (top to bottom) and then scanning across
each row (left to right). These are the vector indices. Place the values in a vector in this order.
As a result, for each subscript, $i$,
The parent, if it exists, is at location $\lfloor (i-1)/2 \rfloor$.
The left child, if it exists, is at location $2i + 1$.
The right child, if it exists, is at location $2i + 2$.
For a binary heap containing $n \ge 2$ values, the last leaf is at location $n - 1$ in the vector and the last internal
(non-leaf) node is the parent of that leaf, at location $\lfloor (n-2)/2 \rfloor$.
The standard library (STL) priority_queue is implemented as a binary heap.
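The index arithmetic can be sketched directly in code. This is an illustrative min-heap version of percolate_down over a std::vector<int>, not the course's reference implementation; the helper names parent, left_child, and right_child are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Index arithmetic for a heap stored in a vector (0-based subscripts).
inline int parent(int i) { return (i - 1) / 2; }  // only meaningful for i > 0
inline int left_child(int i) { return 2 * i + 1; }
inline int right_child(int i) { return 2 * i + 2; }

// Vector version of percolate_down for a min-heap: repeatedly swap the
// value at i with its smaller child until the heap property holds.
void percolate_down(std::vector<int>& heap, int i) {
  int n = heap.size();
  while (left_child(i) < n) {
    int child = left_child(i);
    if (right_child(i) < n && heap[right_child(i)] < heap[child])
      child = right_child(i);
    if (heap[child] < heap[i]) {
      std::swap(heap[child], heap[i]);
      i = child;
    } else {
      break;
    }
  }
}
```

A child "exists" exactly when its computed subscript is less than the vector's size, which is what the loop condition checks.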
Starting with an initially empty heap, show the vector contents for the binary heap after each delete min
operation.
If instead, we ran percolate_up from each index starting at index 0 through index n-1, we would get properly
organized heap data, but incur an O(n log n) cost. Why?
24.14 Heap Sort
Heap Sort is a simple algorithm to sort a vector of values: Build a heap and then run n consecutive pop
operations, storing each popped value in a new vector.
It is straightforward to show that this requires O(n log n) time.
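A sketch of this (not in-place) version of heap sort, using the standard library's heap algorithms in place of a hand-written heap. std::greater<int> is passed so that the smallest value sits at the root; the function name heap_sort is hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Builds a min-heap from a copy of the input, then pops n times, storing
// each popped value in a new vector.
std::vector<int> heap_sort(std::vector<int> values) {
  std::make_heap(values.begin(), values.end(), std::greater<int>());
  std::vector<int> sorted;
  while (!values.empty()) {
    // pop_heap moves the root to the back and restores the heap
    // property on the remaining elements.
    std::pop_heap(values.begin(), values.end(), std::greater<int>());
    sorted.push_back(values.back());
    values.pop_back();
  }
  return sorted;
}
```

Each of the n pops costs O(log n), which is where the O(n log n) total comes from.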
Exercise: Implement an in-place heap sort. An in-place algorithm uses only the memory holding the input
data; a separate large temporary vector is not needed.
Heaps, which are conceptually a binary tree but are implemented in a vector, are the data structure of choice
for a priority queue.
In some applications, the priority of an entry may change while the entry is in the priority queue. This requires
that there be hooks (usually in the form of indices) into the internal structure of the priority queue. This is
an implementation detail we have not discussed.
CSCI-1200 Data Structures Spring 2017
Lecture 25 C++ Inheritance and Polymorphism
Review from Lecture 24
STL Queues and STL Stacks
Definition of a Binary Heap
Building a Heap
Heap Sort
Today's Class
Inheritance is a relationship among classes. Examples: bank accounts, polygons, stack & list
Basic mechanisms of inheritance
Types of inheritance
Is-A, Has-A, As-A relationships among classes.
Polymorphism
SavingsAccount is a derived class from Account. SavingsAccount has inherited member variables & functions
and ordinarily-defined member variables & functions.
The member variable balance in base class Account is protected, which means:
balance is NOT publicly accessible outside the class, but it is accessible in the derived classes.
If balance were declared as private, then SavingsAccount member functions could not access it.
When using objects of type SavingsAccount, the inherited and derived members are treated exactly the same
and are not distinguishable.
CheckingAccount is also a derived class from base class Account.
TimeAccount is derived from SavingsAccount. SavingsAccount is its base class and Account is its indirect
base class.
25.3 Exercise: Draw the Accounts Class Hierarchy
#include <iostream>
// Note we've inlined all the functions (even though some are > 1 line of code)
class Account {
public:
Account(double bal = 0.0) : balance(bal) {}
void deposit(double amt) { balance += amt; }
double get_balance() const { return balance; }
protected:
double balance; // account balance
};
double withdraw(double amt) {
if (amt <= funds_avail) {
funds_avail -= amt;
balance -= amt;
return amt;
} else {
return 0.0;
}
}
double get_avail() const { return funds_avail; };
protected:
double funds_avail; // amount available for withdrawal
};
Private inheritance hides the std::list<T> member functions from the outside world. However, these member
functions are still available to the member functions of the stack<T> class.
Note: no member variables are defined; the only member variables needed are in the list class.
When the stack member function uses the same name as the base class (list) member function, the name of
the base class followed by :: must be provided to indicate that the base class member function is to be used.
The copy constructor just uses the copy constructor of the base class, without any special designation because
the stack object is a list object as well.
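The points above can be sketched as follows. This is an illustrative stack built with private inheritance from std::list, under the assumption that the class in the notes has roughly this interface; the exact member functions there are not shown in this part of the notes.

```cpp
#include <list>

// A stack implemented "as a" list: private inheritance hides the list
// interface from users of the stack, but the stack's own member
// functions can still call it.
template <class T>
class stack : private std::list<T> {
public:
  stack() {}
  // Copy constructor: just hands the other stack to the base-class
  // copy constructor, since a stack object is a list object as well.
  stack(const stack<T>& other) : std::list<T>(other) {}
  void push(const T& value) { this->push_back(value); }
  void pop() { this->pop_back(); }
  const T& top() const { return this->back(); }
  // Same names as base-class members, so the base class is named
  // explicitly with :: to say which version is meant.
  unsigned int size() const { return std::list<T>::size(); }
  bool empty() const { return std::list<T>::empty(); }
};
```

Outside the class, a call such as s.push_front(3) does not compile, because the inherited list members are private to stack.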
[Figure: an inheritance diagram for classes A through G in which two of the inheritance links are marked
virtual, shown next to nested-layer drawings of instances of classes C, F, and G.]
Normally, inheritance just adds layers, like an onion or a nesting doll. In each layer, we store the member
variables for that class.
With multiple inheritance, this could lead to duplicate copies of the member variables for classes A & B.
Instead, we inherit virtually, which requires separate construction of the parts of the diagram marked virtual.
This ensures we have a single unambiguous copy of the member variable data for A & B.
Note that even if a class does not itself use multiple inheritance, it may still have virtual inheritance on its
path and require separate construction.
25.11 Introduction to Polymorphism
Let's consider a small class hierarchy version of polygonal objects:
class Polygon {
public:
Polygon() {}
virtual ~Polygon() {}
int NumVerts() { return verts.size(); }
virtual double Area() = 0;
virtual bool IsSquare() { return false; }
protected:
vector<Point> verts;
};
Functions that are common, or at least have a common interface, are in Polygon.
Some of these functions are marked virtual, which means that when they are redefined by a derived class,
this new definition will be used, even for pointers to base class objects.
Some of these virtual functions, those whose declarations are followed by = 0 are pure virtual, which means
they must be redefined in a derived class.
Any class that has pure virtual functions is called abstract.
Objects of abstract types may not be created; only pointers to these objects may be created.
Functions that are specific to a particular object type are declared in the derived class prototype.
std::list<Polygon*> polygons;
Objects are constructed using new and inserted into the list:
Note: We've used the same pointer variable (p_ptr) to point to objects of two different types.
25.13 Accessing Objects Through a Polymorphic List of Pointers
Let's sum the areas of all the polygons:
double area = 0;
for (std::list<Polygon*>::iterator i = polygons.begin(); i!=polygons.end(); ++i)
area += (*i)->Area();
Which Area function is called? If *i points to a Triangle object then the function defined in the Triangle
class would be called. If *i points to a Quadrilateral object then Quadrilateral::Area will be called.
25.14 Exercise
What is the output of the following program?
class Base {
public:
Base() {}
virtual void A() { std::cout << "Base A "; }
void B() { std::cout << "Base B "; }
};
int main() {
Base* a[3];
a[0] = new Base;
a[1] = new One;
a[2] = new Two;
for (unsigned int i=0; i<3; ++i) {
a[i]->A();
a[i]->B();
}
std::cout << std::endl;
return 0;
}
CSCI-1200 Data Structures Spring 2017
Lecture 26 C++ Exceptions
Review from Lecture 25
Inheritance is a relationship among classes. Examples: bank accounts, polygons, stack & list
Basic mechanisms of inheritance
Types of inheritance
Is-A, Has-A, As-A relationships among classes.
Polymorphism
Today's Class
Error handling strategies
Basic exception mechanisms: try/throw/catch
Functions & exceptions, constructors & exceptions
STL exceptions
RAII: Resource Acquisition Is Initialization
Structured Exception Handling in the Windows Operating System
Google's C++ Style Guide on Exceptions
Some examples from today's lecture are drawn from:
http://www.cplusplus.com/doc/tutorial/exceptions/
http://www.parashift.com/c++-faq-lite/exceptions.html
For small programs, for short term use, by a single programmer, where the input is well known and controlled,
this may not be a disaster (and is often fastest to develop and thus a good choice).
But for large programs, this code will be challenging to maintain. It can be difficult to pinpoint the source
of an error. The symptom of a problem (if noticed at all) may be many steps removed from the source. The
software system maintainer must be familiar with the assumptions of the code (which is difficult if there is a
ton of code, the code was written some time ago, by someone else, or is not sufficiently commented... or all of
the above!).
26.2 Error Handling Strategy B: Plan for the Worst Case (a.k.a. Paranoia)
Anticipate every mistake or source of error (or as many as you can think of). Write lots of if statements
everywhere there may be a problem. Write code for what to do instead, print out error messages, and/or
exit when nothing seems reasonable.
double answer;
// for some application specific epsilon (often not easy to specify)
double epsilon = 0.00001;
if (fabs(denom) < epsilon) {
std::cerr << "detected a divide by zero error" << std::endl;
// what to do now? (often there is no "right" thing to do)
answer = 0;
} else {
answer = numer / denom;
}
Error checking & error handling generally requires a lot of programmer time to write all of this error code.
Creating a comprehensive test suite (yes, error checking/handling code must be tested too!) that exercises all
the error cases is extremely time consuming, and some error situations are very difficult to produce.
26.3 Error Handling Strategy C: If/When It Happens We'll Fix It (a.k.a. Procrastination)
Again, anticipate everything that might go wrong and just call assert in lots of places. This can be
somewhat less work than the previous option (we don't need to decide what to do if the error happens, the
program just exits immediately).
This can be a great tool during the software development process. Write code to test all (or most) of the
assumptions in each function/code unit. Quickly get a prototype system up and running that works for the
general, most common, non-error cases first.
If/when an unexpected input or condition occurs, then additional code can be written to more appropriately
handle special cases and errors.
However, the use of assertions is generally frowned upon in real-world production code (users don't like to
receive seemingly arbitrary & total system failures, especially when they paid for the software!).
Once you have completed testing & debugging, and are fairly confident that the likely error cases are
appropriately handled, then the gcc compile flag -DNDEBUG can be used to remove all remaining assert statements
before compiling the code (conveniently removing any performance overhead for assert checking).
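As a small illustration of this strategy, here is a sketch of a function that states its assumption with assert. The function average is hypothetical and not from the notes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical function: the average of a non-empty vector.
double average(const std::vector<double>& values) {
  // If this condition is false the program stops immediately and reports
  // the file and line number. When compiled with -DNDEBUG the check is
  // removed entirely.
  assert(!values.empty());
  double sum = 0;
  for (unsigned int i = 0; i < values.size(); ++i)
    sum += values[i];
  return sum / values.size();
}
```

Compiling with g++ -DNDEBUG turns the assert line into nothing, so an empty vector would then reach the division unchecked.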
throw 20;
throw std::string("hello");
throw Foo(2,5);
You can throw a value of any type (e.g., int, std::string, an instance of a custom class, etc.)
When the throw statement is triggered, the rest of that block of code is abandoned.
26.6 Basic Exception Mechanisms: Try/Catch
If you suspect that a fragment of code you are about to execute may throw an exception and you want to
prevent the program from crashing, you should wrap that fragment within a try/catch block:
try {
/* the code that might throw */
}
catch (int x) {
/* what to do if the throw happened
(may use the variable x)
*/
}
/* the rest of the program */
The logic of the try block may throw more than one type of exception.
A catch statement specifies what type of exception it catches (e.g., int, std::string, etc.)
You may use multiple catch blocks to catch different types of exceptions from the same try block.
You may use catch (...) { /* code */ } to catch all types of exceptions. (But you don't get to use the
value that was thrown!)
If an exception is thrown, the program searches for the closest enclosing try/catch block with the appropriate
type. That try/catch may be several functions away on the call stack (it might be all the way back in the main
function!).
If no appropriate catch statement is found, the program exits, e.g.:
terminate called after throwing an instance of 'bool'
Abort trap
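The rules above can be seen in one small sketch. All three function names (risky, middle, describe) are hypothetical; the point is that the throw happens two calls below the try block and is matched to a catch by type.

```cpp
#include <string>

// Hypothetical function that throws a different type for each code.
void risky(int code) {
  if (code == 1) throw 20;
  if (code == 2) throw std::string("hello");
  if (code == 3) throw 3.5;
}

// An extra layer of function call between the throw and the catch.
void middle(int code) { risky(code); }

// Reports which catch block (if any) handled the exception.
std::string describe(int code) {
  try {
    middle(code);
  } catch (int x) {
    return "int " + std::to_string(x);
  } catch (const std::string& s) {
    return "string " + s;
  } catch (...) {
    // A double lands here: no earlier catch matches its type, and the
    // thrown value itself is not available.
    return "something else";
  }
  return "nothing thrown";
}
```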
int main() {
try {
my_func(1,2);
}
catch (double x) {
std::cout << " caught a double " << x << std::endl;
}
catch (...) {
std::cout << " caught some other type " << std::endl;
}
}
If you use the throw syntax in the prototype, and the function throws an exception of a type that you have
not listed, the program will terminate immediately (it can't be caught by any enclosing try statements).
If you don't use the throw syntax in the prototype, the function may throw exceptions of any type, and they
may be caught by an appropriate try/catch block.
Note that listing exception types in the prototype this way was deprecated in C++11 and removed in C++17.
26.8 Comparing Method B (explicit if tests) to Method D (exceptions)
Here's code using exceptions to sort a collection of lines by slope:
class Point {
public:
Point(double x_, double y_) : x(x_),y(y_) {}
double x,y;
};
class Line {
public:
Line(const Point &a_, const Point &b_) : a(a_),b(b_) {}
Point a,b;
};
int main () {
std::vector<Line> lines;
/* omitting code to initialize some data */
try {
organize(lines);
/* omitting code to print out the results */
} catch (int) {
std::cout << "error: infinite slope" << std::endl;
}
}
Specifically note the behavior if one of the lines has infinite slope (a vertical line).
Note also how the exception propagates out through several nested function calls.
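The helper functions between main and the throw are not shown in this part of the notes. The sketch below is one plausible arrangement, with all four function bodies being assumptions: compute_slope throws an int (to match the catch (int) in main) when the line is vertical, and the exception passes out through slope, the comparison function, std::sort, and organize.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

class Point {
public:
  Point(double x_, double y_) : x(x_), y(y_) {}
  double x, y;
};

class Line {
public:
  Line(const Point& a_, const Point& b_) : a(a_), b(b_) {}
  Point a, b;
};

// Assumed helper: throws an int if the line through a & b is vertical.
double compute_slope(const Point& a, const Point& b) {
  double rise = b.y - a.y;
  double run = b.x - a.x;
  double epsilon = 0.00001;
  if (std::fabs(run) < epsilon) throw -1;
  return rise / run;
}

double slope(const Line& ln) { return compute_slope(ln.a, ln.b); }

// Comparison for std::sort: steepest line first.
bool steeper_slope(const Line& m, const Line& n) {
  return slope(m) > slope(n);
}

// No try/catch here: any exception simply passes through to the caller.
void organize(std::vector<Line>& lines) {
  std::sort(lines.begin(), lines.end(), steeper_slope);
}
```

None of the intermediate functions mention the error case at all, which is the contrast with Method B that the exercise asks you to explore.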
Exercise: Rewrite this code to have the same behavior but without exceptions. Try to preserve the overall
structure of the code as much as possible. (Hmm... it's messy!)
26.9 STL exception Class
STL provides a base class std::exception in the <exception> header file. You can derive your own exception
type from the exception class, and override the what() member function:
class myexception: public std::exception {
virtual const char* what() const throw() {
return "My exception happened";
}
};
int main () {
myexception myex;
try {
throw myex;
}
catch (std::exception& e) {
std::cout << e.what() << std::endl;
}
return 0;
}
The STL library throws several different types of exceptions (all derived from the STL exception class):
bad_alloc thrown by new on allocation failure
bad_cast thrown by dynamic_cast (when casting to a reference variable rather than a pointer)
bad_exception thrown when an exception type doesn't match any catch
bad_typeid thrown by typeid
ios_base::failure thrown by functions in the iostream library
It can also be useful to have the constructor for a custom class throw a descriptive exception if the arguments
are invalid in some way.
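For instance, a constructor might reject a bad argument like this. The Fraction class and the try_build helper are hypothetical; std::invalid_argument is one of the standard exception classes declared in <stdexcept>.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical class whose constructor refuses a zero denominator.
class Fraction {
public:
  Fraction(int n, int d) : numer(n), denom(d) {
    if (d == 0)
      throw std::invalid_argument("Fraction: denominator must be non-zero");
  }
  double value() const { return double(numer) / denom; }
private:
  int numer, denom;
};

// Returns the what() message if construction fails, or "" on success.
std::string try_build(int n, int d) {
  try {
    Fraction f(n, d);
    return "";
  } catch (const std::exception& e) {
    // Catching the base class by reference also catches invalid_argument.
    return e.what();
  }
}
```

Because the constructor never finishes, no half-built Fraction object is ever visible to the rest of the program.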
Variables allocated on the stack (not dynamically-allocated using new) are guaranteed to be properly destructed
when the variable goes out of scope (e.g., when an exception is thrown and we abandon a partially executed
block of code or function).
Special care must be taken for dynamically-allocated variables (and other resources like open files, mutexes,
etc.) to ensure that the code is exception safe.
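One common way to get exception safety is to let a stack variable own the dynamically-allocated resource, so that its destructor does the cleanup. The sketch below uses a hypothetical IntBuffer class with a static counter added purely so the cleanup can be observed.

```cpp
#include <stdexcept>

// Hypothetical RAII wrapper: acquires an int array in the constructor
// and releases it in the destructor.
class IntBuffer {
public:
  explicit IntBuffer(int n) : data(new int[n]) { ++live; }
  ~IntBuffer() { delete [] data; --live; }
  int& operator[](int i) { return data[i]; }
  static int live;  // number of buffers currently allocated (demo only)
private:
  IntBuffer(const IntBuffer&);             // copying disabled to keep
  IntBuffer& operator=(const IntBuffer&);  // the sketch short
  int* data;
};
int IntBuffer::live = 0;

void work(bool fail) {
  IntBuffer buf(100);  // a stack variable that owns heap memory
  buf[0] = 42;
  if (fail) throw std::runtime_error("something went wrong");
}  // buf's destructor runs here on a normal return AND when the
   // exception abandons the function partway through
```

Had work() used a raw new[] and a delete[] at the end of the function, the throw would have skipped the delete[] and leaked the array.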
26.12 Structured Exception Handling (SEH) in the Windows Operating System
The Windows Operating System has special language support, called Structured Exception Handling (SEH), to
handle hardware exceptions. Some examples of hardware exceptions include divide by zero and segmentation
faults (there are others!).
In Unix/Linux/Mac OSX these hardware exceptions are instead dealt with using signal handlers. Unfortunately,
writing error handling code using signal handlers incurs a larger performance hit (due to setjmp) and the design
of the error handling code is less elegant than the usual C++ exception system because signal handlers are
global entities.
CSCI-1200 Data Structures Spring 2017
Lecture 27 Garbage Collection & Smart Pointers
Announcements
Please fill out your course evaluations!
Those of you interested in becoming an undergraduate mentor for Data Structures, or another CSCI course:
Speak to your graduate lab TA and ask him/her to recommend you for the position.
A week or two before the start of the Spring term, David Goldschmidt will post the online application for
mentors for CS1, DS, and other CSCI courses. He'll send it to the CSCI undergraduate mailing list, but
it will (probably) also be posted on Facebook & Reddit.
The final exam practice problems will be posted on the calendar this afternoon.
If we get at least 85% response to the course evaluations, we will post the solutions early.
Today's Lecture
What is Garbage?
With C++, the programmer is expected to perform explicit memory management. You must use delete when
you are done with dynamically allocated memory (which was created with new).
In Java, and other languages with garbage collection, you are not required to explicitly de-allocate the
memory. The system automatically determines what is garbage and returns it to the available pool of memory.
Certainly this makes it easier to learn to program in these languages, but automatic memory management does
have performance and memory usage disadvantages.
Today we'll give an overview of 3 basic techniques for automatic memory management.
class Node {
public:
Node(char v, Node* l, Node* r) :
value(v), left(l), right(r) {}
char value;
Node* left;
Node* right;
};
27.3 Garbage Collection Technique #1: Reference Counting
1. Attach a counter to each Node in memory.
2. When a new pointer is connected to that Node, increment the counter.
3. When a pointer is removed, decrement the counter.
4. Any Node with counter == 0 is garbage and is available for reuse.
[Figure: a Node drawn as a box with value, left, and right fields]
root: 105
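The four steps can be sketched in a few lines. CountedNode and reassign are hypothetical names; the sketch only tracks the counter and reports when a node becomes garbage, without actually reusing the memory.

```cpp
// Hypothetical node with an attached counter (step 1).
struct CountedNode {
  char value;
  int count;  // how many pointers currently refer to this node
  CountedNode(char v) : value(v), count(0) {}
};

// Re-aims the pointer 'slot' at 'target' (either may be null).
// Step 2: the newly connected node's counter goes up.
// Step 3: the previously connected node's counter goes down.
// Step 4: returns true if the old node's counter reached 0 (garbage).
bool reassign(CountedNode*& slot, CountedNode* target) {
  if (target) ++target->count;  // increment first, in case slot == target
  bool garbage = false;
  if (slot) {
    --slot->count;
    garbage = (slot->count == 0);
  }
  slot = target;
  return garbage;
}
```

A group of nodes that point at each other in a cycle never reaches a count of 0, even when nothing outside the cycle refers to it.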
27.6 Garbage Collection Technique #2: Stop and Copy
1. Split memory in half (working memory and copy memory).
2. When out of working memory, stop computation and begin garbage collection.
(a) Place scan and free pointers at the start of the copy memory.
(b) Copy the root to copy memory, incrementing free. Whenever a node is copied from working memory,
leave a forwarding address to its new location in copy memory in the left address slot of its old location.
(c) Starting at the scan pointer, process the left and right pointers of each node. Look for their locations
in working memory. If the node has already been copied (i.e., it has a forwarding address), update the
reference. Otherwise, copy the location (as before) and update the reference.
(d) Repeat until scan == free.
(e) Swap the roles of the working and copy memory.
WORKING MEMORY
address 100 101 102 103 104 105 106 107
value a b c d e f g h
left 0 0 100 100 0 102 105 104
right 0 100 103 0 105 106 0 0
COPY MEMORY
address 108 109 110 111 112 113 114 115
value
left
right
root: 105
scan:
free:
4. Mark
(a) Start at the root and follow the accessible structure (keeping a stack of where you still need to go).
(b) Mark every node you visit.
(c) Stop when you see a marked node, so you don't go into a cycle.
5. Sweep
(a) Start at the end of memory, and build a new free list.
(b) If a node is unmarked, then it's garbage, so hook it into the free list by chaining the left pointers.
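The Mark and Sweep steps above can be sketched over a vector that plays the role of memory. The Cell struct, the base offset, and the choice to clear the marks during the sweep are assumptions made for this illustration.

```cpp
#include <stack>
#include <vector>

// Hypothetical memory cell. Addresses are vector indices plus 'base'
// (e.g., base 100 means mem[0] lives at address 100); 0 means null.
struct Cell {
  char value;
  int left, right;
  bool marked;
};

// Mark: visit everything reachable from root, using an explicit stack
// of addresses still to be visited.
void mark(std::vector<Cell>& mem, int base, int root) {
  std::stack<int> todo;
  if (root != 0) todo.push(root);
  while (!todo.empty()) {
    int addr = todo.top();
    todo.pop();
    Cell& c = mem[addr - base];
    if (c.marked) continue;  // already visited: stop, to avoid cycling
    c.marked = true;
    if (c.left != 0) todo.push(c.left);
    if (c.right != 0) todo.push(c.right);
  }
}

// Sweep: walk from the end of memory, chaining unmarked cells together
// through their left pointers. Returns the address at the head of the
// free list (0 if nothing was freed). Marks are cleared along the way
// so the next collection starts fresh.
int sweep(std::vector<Cell>& mem, int base) {
  int free_list = 0;
  for (int i = int(mem.size()) - 1; i >= 0; --i) {
    if (mem[i].marked) {
      mem[i].marked = false;
    } else {
      mem[i].left = free_list;
      free_list = base + i;
    }
  }
  return free_list;
}
```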
27.9 Mark-Sweep Exercise
Let's perform Mark-Sweep on the following with root = 105:
root: 105
free:
stack:
27.12 What's a Smart Pointer?
The goal is to create a widget that works just like a regular pointer most of the time, except at the beginning
and end of its lifetime. The syntax of how we construct smart pointers is a bit different and we don't need to
obsess about how & when it will get deleted (it happens automatically).
Here's one flavor of a smart pointer (much simplified from STL):
template <class T>
class auto_ptr {
public:
explicit auto_ptr(T* p = NULL) : ptr(p) {} /* prevents cast/conversion */
~auto_ptr() { delete ptr; }
T& operator*() { return *ptr; }
T* operator->() { return ptr; } /* fakes being a pointer */
private:
T* ptr;
};
And let's start with some example code without smart pointers:
void foo() {
Polygon* p(new Polygon(/* stuff */));
p->DoSomething();
delete p;
}
Here's how we can re-write the same example with our auto_ptr:
void foo() {
  auto_ptr<Polygon> p(new Polygon(/* stuff */));
  p->DoSomething();
}
We don't have to call delete! There's no memory leak or memory error in this code. Awesome!
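The same pattern works with the standard library's std::unique_ptr (C++11). In this sketch, Widget is a hypothetical stand-in for Polygon with an object counter added so that the automatic delete can be observed.

```cpp
#include <memory>

// Hypothetical class that counts how many of its objects exist.
struct Widget {
  static int alive;
  Widget() { ++alive; }
  ~Widget() { --alive; }
  void DoSomething() {}
};
int Widget::alive = 0;

void foo() {
  std::unique_ptr<Widget> p(new Widget());
  p->DoSomething();  // used exactly like a regular pointer
}  // p goes out of scope here and its destructor deletes the Widget
```

After foo() returns, Widget::alive is back to 0 without any explicit delete.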
std::vector<Polygon*> polys;
polys.push_back(new Triangle(/*...*/));
polys.push_back(new Quad(/*...*/));
In contrast, with smart pointers they will be deallocated automagically!
polys.push_back(shared_ptr<Polygon>(new Triangle(/*...*/)));
polys.push_back(shared_ptr<Polygon>(new Quad(/*...*/)));
auto_ptr
When copied (copy constructor), the new object takes ownership and the old object is now empty. Deprecated
in new C++ standard.
unique_ptr
Cannot be copied (copy constructor not public). Can only be moved to transfer ownership. Explicit ownership
transfer. Intended to replace auto_ptr. std::unique ptr has memory overhead only if you provide it with some
non-trivial deleter. It has time overhead only during constructor (if it has to copy the provided deleter) and
during destructor (to destroy the owned object).
scoped_ptr (Boost)
Remembers to delete things when they go out of scope. Alternate to auto_ptr. Cannot be copied.
shared_ptr
Reference counted ownership of pointer. Unfortunately, circular references are still a problem. Different
sub-flavors based on where the counter is stored in memory relative to the object, e.g., intrusive_ptr, which
is more memory efficient. std::shared_ptr always has a small memory overhead for the reference counter.
It has time overhead in the constructor (to create the reference counter), in the destructor (to decrement
the reference counter and possibly destroy the object), and in the assignment operator (to increment the
reference counter).
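To make the reference counting concrete, here is a small sketch, assuming C++11. The Polygon, Triangle, and Quad definitions are simplified, hypothetical stand-ins, and shared_ptr_demo is an illustrative name.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified class hierarchy for illustration only.
struct Polygon {
  virtual ~Polygon() {}
  virtual int sides() const = 0;
};
struct Triangle : Polygon { int sides() const { return 3; } };
struct Quad : Polygon { int sides() const { return 4; } };

long shared_ptr_demo() {
  std::vector<std::shared_ptr<Polygon> > polys;
  polys.push_back(std::shared_ptr<Polygon>(new Triangle()));
  polys.push_back(std::shared_ptr<Polygon>(new Quad()));

  std::shared_ptr<Polygon> extra = polys[0];   // copy: the Triangle's count goes 1 -> 2
  long count_while_shared = extra.use_count();

  polys.clear();  // the Quad is destroyed here; the Triangle survives via 'extra'
  assert(extra.use_count() == 1);
  assert(extra->sides() == 3);
  return count_while_shared;
}  // 'extra' goes out of scope: the count reaches 0 and the Triangle is deleted
```

The object is deleted exactly when the last shared_ptr pointing to it is destroyed or reassigned, no matter which owner that happens to be.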
weak_ptr
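Use with shared_ptr. A weak_ptr does not contribute to the reference count: the memory is destroyed when no
more shared_ptrs are pointing to the object. So each time a weak_ptr is used, you should first lock the data by
creating a shared_ptr (the lock() member function does this).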
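A minimal sketch of the lock-before-use pattern, assuming C++11; weak_ptr_demo is a hypothetical name, and an int stands in for a real object.

```cpp
#include <cassert>
#include <memory>

bool weak_ptr_demo() {
  std::weak_ptr<int> observer;
  {
    std::shared_ptr<int> owner(new int(42));
    observer = owner;                  // does not increase the reference count
    assert(owner.use_count() == 1);

    // lock() creates a temporary shared_ptr, so the int cannot be
    // destroyed while we are using it.
    if (std::shared_ptr<int> locked = observer.lock()) {
      assert(*locked == 42);
    }
  }  // 'owner' goes out of scope: the int is deleted

  // The object is gone, so lock() now returns an empty shared_ptr.
  return observer.expired() && !observer.lock();
}
```

Always testing the result of lock() is what makes a weak_ptr safer than a raw pointer that may dangle.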
scoped_array and shared_array (Boost)
CSCI-1200 Data Structures Spring 2017
Lecture 28 Concurrency & Asynchronous Computing
Final Exam General Information
The final exam will be held: Wednesday May 10th from 3-6pm. Your room and zone assignment
will be posted on the homework server next week.
A makeup exam will only be offered if required by the RPI rules regarding final exam conflicts -OR- if a written
excuse from the Dean of Students office is provided. Contact the ds instructors list by email immediately if
you have a conflict.
Please check the homework submission server data entry for your grades early next week. Email your lab TA
if there is any error before the final exam.
What happens when objects don't change one at a time but rather act concurrently?
We may be able to take advantage of this by letting threads/processes run at the same time
(a.k.a., in parallel).
However, we will need to think carefully about the interactions and shared resources.
28.3 Concurrency Example: Joint Bank Account
Consider the following bank account implementation:
class Account {
public:
Account(int amount) : balance(amount) {}
void deposit(int amount) {
int tmp = balance; // A
tmp += amount; // B
balance = tmp; // C
}
void withdraw(int amount) {
int tmp = balance; // D
if (amount > tmp)
cout << "Error: Insufficient Funds!" << endl; // E1
else {
tmp -= amount; // E2
}
balance = tmp; // F
}
private:
int balance;
};
Now, enumerate all of the possible interleavings of the sub-expressions (A-F) if the following two function calls
were to happen concurrently. What are the different outcomes?
account.deposit(50);
account.withdraw(125);
Exercise: What are the acceptable outcomes for the bank account example?
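One way to check a hand enumeration is to simulate every interleaving. The sketch below is an illustration, not part of the original exercise: it models each call as three steps (A, B, C and D, E, F) acting on a shared balance, and collects every final balance that can result. The State struct and all function names are hypothetical, and the starting balance is an assumption because the notes do not give one.

```cpp
#include <cassert>
#include <set>

// Snapshot of the shared balance plus each call's local 'tmp' variable.
struct State {
  int balance;
  int dep_tmp;  // tmp inside deposit()
  int wd_tmp;   // tmp inside withdraw()
};

// Perform step 0, 1, or 2 of deposit(amount): lines A, B, C.
State deposit_step(State s, int step, int amount) {
  if (step == 0) s.dep_tmp = s.balance;        // A
  else if (step == 1) s.dep_tmp += amount;     // B
  else s.balance = s.dep_tmp;                  // C
  return s;
}

// Perform step 0, 1, or 2 of withdraw(amount): lines D, E1/E2, F.
State withdraw_step(State s, int step, int amount) {
  if (step == 0) s.wd_tmp = s.balance;         // D
  else if (step == 1) {
    if (amount <= s.wd_tmp) s.wd_tmp -= amount;  // E2 (E1 only prints an error)
  }
  else s.balance = s.wd_tmp;                   // F
  return s;
}

// Recursively try every order in which the remaining steps can run.
// 'd' and 'w' count how many deposit and withdraw steps have already run.
void explore(State s, int d, int w, int dep_amt, int wd_amt, std::set<int>& finals) {
  if (d == 3 && w == 3) { finals.insert(s.balance); return; }
  if (d < 3) explore(deposit_step(s, d, dep_amt), d + 1, w, dep_amt, wd_amt, finals);
  if (w < 3) explore(withdraw_step(s, w, wd_amt), d, w + 1, dep_amt, wd_amt, finals);
}

// Returns the set of final balances over all 20 interleavings.
std::set<int> possible_balances(int start, int dep_amt, int wd_amt) {
  State s = { start, 0, 0 };
  std::set<int> finals;
  explore(s, 0, 0, dep_amt, wd_amt, finals);
  return finals;
}
```

With an assumed starting balance of 100, possible_balances(100, 50, 125) contains 25, 100, and 150. The two serial orders give 25 (deposit first) and 150 (withdrawal rejected, then deposit); the value 100 arises only when the withdrawal reads the balance before line C and writes it back after, silently losing the deposit.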
28.5 Serialization via a Mutex
We can serialize the important interactions using a primitive, atomic synchronization method called a mutex.
Once one thread has acquired the mutex (locking the resource), no other thread can acquire the mutex until it
has been released.
In the example below we use the STL mutex object (#include <mutex>). If the mutex is unavailable, the
call to the mutex member function lock() blocks (the thread pauses at that line of code until the mutex is
available).
class Chalkboard {
public:
Chalkboard() { }
void write(Drawing d) {
board.lock();
drawing = d;
board.unlock();
}
Drawing read() {
board.lock();
Drawing answer = drawing;
board.unlock();
return answer;
}
private:
Drawing drawing;
std::mutex board;
};
class Professor {
public:
Professor(Chalkboard *c) { chalkboard = c; }
virtual void Lecture(const std::string &notes) {
chalkboard->write(notes);
}
protected:
Chalkboard* chalkboard;
};
class Student {
public:
Student(Chalkboard *c) { chalkboard = c; }
void TakeNotes() {
Drawing d = chalkboard->read();
notebook.push_back(d);
}
private:
Chalkboard* chalkboard;
std::vector<Drawing> notebook;
};
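The same locking idea can be applied to the joint bank account from 28.3. The sketch below is one possible version, not the course's reference solution: SafeAccount and run_concurrent are hypothetical names, and it uses std::lock_guard (also from <mutex>), which calls lock() in its constructor and unlock() in its destructor so that an early return cannot leave the mutex locked. It also uses std::thread from <thread> to run the two calls concurrently.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

class SafeAccount {
public:
  explicit SafeAccount(int amount) : balance(amount) {}
  void deposit(int amount) {
    std::lock_guard<std::mutex> guard(m);  // locks here, unlocks when 'guard' is destroyed
    balance += amount;
  }
  // Returns false (and leaves the balance unchanged) on insufficient funds.
  bool withdraw(int amount) {
    std::lock_guard<std::mutex> guard(m);
    if (amount > balance) return false;    // the guard still unlocks on this early return
    balance -= amount;
    return true;
  }
  int get_balance() {
    std::lock_guard<std::mutex> guard(m);
    return balance;
  }
private:
  int balance;
  std::mutex m;
};

// Run one deposit and one withdrawal concurrently; return the final balance.
int run_concurrent(int start, int dep, int wd) {
  SafeAccount account(start);
  std::thread t1(&SafeAccount::deposit, &account, dep);
  std::thread t2(&SafeAccount::withdraw, &account, wd);
  t1.join();
  t2.join();
  return account.get_balance();
}
```

Because each member function holds the mutex for its whole read-modify-write sequence, only the two serial orderings remain possible. Which of the two occurs still depends on thread scheduling.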
28.7 Launching Concurrent Threads
So how exactly do we get multiple streams of computation happening simultaneously? There are many choices
(may depend on your programming language, operating system, compiler, etc.).
We'll use the STL thread library (#include <thread>). The new thread begins execution in the provided
function (student_thread, in this example). We pass the necessary shared data from the main thread to the
secondary thread to facilitate communication.
#define num_notes 10
int main() {
Chalkboard chalkboard;
Professor prof(&chalkboard);
std::thread student(student_thread, &chalkboard);
for (int i = 0; i < num_notes; i++) {
prof.Lecture("blah blah");
}
student.join();
}
The join command pauses to wait for the secondary thread to finish computation before continuing with the
program (or exiting in this example).
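The notes do not show a body for student_thread, so the following self-contained sketch supplies a hypothetical one. It assumes a Drawing is just a std::string, uses a simplified Chalkboard with std::lock_guard in place of explicit lock()/unlock() calls, and passes the notebook to the thread by pointer. All of these are choices made for this illustration.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

typedef std::string Drawing;  // assumption: a drawing is just text

class Chalkboard {
public:
  void write(const Drawing &d) {
    std::lock_guard<std::mutex> guard(board);
    drawing = d;
  }
  Drawing read() {
    std::lock_guard<std::mutex> guard(board);
    return drawing;
  }
private:
  Drawing drawing;
  std::mutex board;
};

// Hypothetical thread function: copy the board 'count' times into the notebook.
void student_thread(Chalkboard *chalkboard, std::vector<Drawing> *notebook, int count) {
  for (int i = 0; i < count; i++) {
    notebook->push_back(chalkboard->read());
  }
}

int launch_demo() {
  const int num_notes = 10;
  Chalkboard chalkboard;
  std::vector<Drawing> notebook;
  // The extra arguments are forwarded to student_thread in the new thread.
  std::thread student(student_thread, &chalkboard, &notebook, num_notes);
  for (int i = 0; i < num_notes; i++) {
    chalkboard.write("blah blah");
  }
  student.join();  // after join() returns, it is safe to inspect the notebook
  return static_cast<int>(notebook.size());
}
```

Only the student thread touches the notebook while it is running, and the main thread reads it only after join(), so the notebook itself needs no mutex.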
What can still go wrong? How can we fix it?
class Chalkboard {
public:
Chalkboard() { student_done = true; }
void write(Drawing d) {
while (1) {
board.lock();
if (student_done) {
drawing = d;
student_done = false;
board.unlock();
return;
}
board.unlock();
}
}
Drawing read() {
while (1) {
board.lock();
if (!student_done) {
Drawing answer = drawing;
student_done = true;
board.unlock();
return answer;
}
board.unlock();
}
}
private:
Drawing drawing;
std::mutex board;
bool student_done;
};
Note: This implementation is actually quite inefficient due to busy waiting. A better solution is to use an
operating-system-supported condition variable, which lets a thread sleep (yielding to other threads) while the
condition it needs does not hold and wakes it when another thread signals a change. STL has a
condition_variable type which allows you to wait for or notify other threads that it may be time to resume
computation.
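A possible condition-variable version of the Chalkboard is sketched below, assuming C++11. The member name changed, the student_thread body, and lecture_demo are hypothetical, and Drawing is again assumed to be a std::string. The two-argument form of wait() releases the mutex while the thread sleeps and rechecks the predicate each time it wakes.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

typedef std::string Drawing;  // assumption: a drawing is just text

class Chalkboard {
public:
  Chalkboard() : student_done(true) {}
  void write(const Drawing &d) {
    std::unique_lock<std::mutex> lock(board);
    // Sleep until the student has copied the previous drawing.
    changed.wait(lock, [this] { return student_done; });
    drawing = d;
    student_done = false;
    changed.notify_all();  // wake a reader that may be waiting
  }
  Drawing read() {
    std::unique_lock<std::mutex> lock(board);
    // Sleep until there is a new drawing to copy.
    changed.wait(lock, [this] { return !student_done; });
    student_done = true;
    changed.notify_all();  // wake a writer that may be waiting
    return drawing;        // copied before 'lock' is released
  }
private:
  Drawing drawing;
  std::mutex board;
  std::condition_variable changed;
  bool student_done;
};

// Hypothetical thread function: copy 'count' drawings into the notebook.
void student_thread(Chalkboard *c, std::vector<Drawing> *notebook, int count) {
  for (int i = 0; i < count; i++) notebook->push_back(c->read());
}

std::vector<Drawing> lecture_demo(int count) {
  Chalkboard chalkboard;
  std::vector<Drawing> notebook;
  std::thread student(student_thread, &chalkboard, &notebook, count);
  for (int i = 0; i < count; i++) {
    chalkboard.write("note " + std::to_string(i));
  }
  student.join();
  return notebook;
}
```

The student_done flag still enforces strict alternation between writer and reader, so every drawing is copied exactly once and in order, but a waiting thread now sleeps instead of repeatedly locking and unlocking the mutex.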
28.11 Topics Covered
Algorithm analysis: big O notation; best case, average case, or worst case; algorithm running time or additional
memory usage
STL classes: string, vector, list, map, & set, (we talked about but did not practice using STL stack,
queue, unordered_set, unordered_map, & priority_queue)
C++ Classes: constructors (default, copy, & custom argument), assignment operator, & destructor, classes
with dynamically-allocated memory, operator overloading, inheritance, polymorphism
Subscripting (random-access, pointer arithmetic) vs. iteration
When choosing between algorithms and between container classes (data structures) you should consider:
efficiency,
naturalness of use, and
ease of programming.
Use classes with well-designed public and private member functions to encapsulate sections of code.
Writing your own container classes and data structures usually requires building linked structures and managing
memory through the big three:
copy constructor,
assignment operator, and
destructor.
When testing and debugging:
Test one function and one class at a time,
Figure out what your program actually does, not what you wanted it to do,
Use small examples and boundary conditions when testing, and
Find and fix the first mistake in the flow of your program before considering other apparent mistakes.
Above all, remember the excitement and satisfaction when your hard work and focused debugging are rewarded
with a program that demonstrates your technical mastery and realizes your creative problem-solving skills!