Data Structures & Algorithm Analysis in C++ 4th Edition (eBook PDF)instant download
Data Structures & Algorithm Analysis in C++ 4th Edition (eBook PDF)instant download
https://ebooksecure.com/product/data-structures-algorithm-
analysis-in-c-4th-edition-ebook-pdf/
http://ebooksecure.com/product/data-structures-algorithm-
analysis-in-c-4th-edition-ebook-pdf/
http://ebooksecure.com/product/ebook-pdf-starting-out-with-java-
from-control-structures-through-data-structures-4th-edition/
https://ebooksecure.com/download/data-structures-ebook-pdf/
https://ebooksecure.com/download/c-programming-program-design-
including-data-structures-ebook-pdf/
(eBook PDF) Data Structures and Abstractions with Java
4th Edition
http://ebooksecure.com/product/ebook-pdf-data-structures-and-
abstractions-with-java-4th-edition/
http://ebooksecure.com/product/ebook-pdf-data-structures-and-
abstractions-with-java-4th-global-edition/
http://ebooksecure.com/product/ebook-pdf-data-structures-and-
other-objects-using-java-4th-edition/
http://ebooksecure.com/product/ebook-pdf-data-structures-and-
problem-solving-using-java-4th-edition/
https://ebooksecure.com/download/data-structures-algorithms-in-
python-ebook-pdf-2/
CO NTE NTS
Preface xv
1.7.3 Big-Five 46
Summary 46
Exercises 46
References 48
Summary 374
Exercises 375
References 376
Index 619
P R E FAC E
Purpose/Goals
The fourth edition of Data Structures and Algorithm Analysis in C++ describes data structures,
methods of organizing large amounts of data, and algorithm analysis, the estimation of the
running time of algorithms. As computers become faster and faster, the need for programs
that can handle large amounts of input becomes more acute. Paradoxically, this requires
more careful attention to efficiency, since inefficiencies in programs become most obvious
when input sizes are large. By analyzing an algorithm before it is actually coded, students
can decide if a particular solution will be feasible. For example, in this text students look at
specific problems and see how careful implementations can reduce the time constraint for
large amounts of data from centuries to less than a second. Therefore, no algorithm or data
structure is presented without an explanation of its running time. In some cases, minute
details that affect the running time of the implementation are explored.
Once a solution method is determined, a program must still be written. As computers
have become more powerful, the problems they must solve have become larger and more
complex, requiring development of more intricate programs. The goal of this text is to teach
students good programming and algorithm analysis skills simultaneously so that they can
develop such programs with the maximum amount of efficiency.
This book is suitable for either an advanced data structures course or a first-year
graduate course in algorithm analysis. Students should have some knowledge of inter-
mediate programming, including such topics as pointers, recursion, and object-based
programming, as well as some background in discrete math.
Approach
Although the material in this text is largely language-independent, programming requires
the use of a specific language. As the title implies, we have chosen C++ for this book.
C++ has become a leading systems programming language. In addition to fixing many
of the syntactic flaws of C, C++ provides direct constructs (the class and template) to
implement generic data structures as abstract data types.
The most difficult part of writing this book was deciding on the amount of C++ to
include. Use too many features of C++ and one gets an incomprehensible text; use too few
and you have little more than a C text that supports classes.
The approach we take is to present the material in an object-based approach. As such,
there is almost no use of inheritance in the text. We use class templates to describe generic
data structures. We generally avoid esoteric C++ features and use the vector and string
classes that are now part of the C++ standard. Previous editions have implemented class
templates by separating the class template interface from its implementation. Although
this is arguably the preferred approach, it exposes compiler problems that have made it xv
xvi Preface
difficult for readers to actually use the code. As a result, in this edition the online code
represents class templates as a single unit, with no separation of interface and implementa-
tion. Chapter 1 provides a review of the C++ features that are used throughout the text and
describes our approach to class templates. Appendix A describes how the class templates
could be rewritten to use separate compilation.
Complete versions of the data structures, in both C++ and Java, are available on
the Internet. We use similar coding conventions to make the parallels between the two
languages more evident.
r Chapter 4 includes implementation of the AVL tree deletion algorithm—a topic often
requested by readers.
r Chapter 5 has been extensively revised and enlarged and now contains material on
two newer algorithms: cuckoo hashing and hopscotch hashing. Additionally, a new
section on universal hashing has been added. Also new is a brief discussion of the
unordered_set and unordered_map class templates introduced in C++11.
r Chapter 6 is mostly unchanged; however, the implementation of the binary heap makes
use of move operations that were introduced in C++11.
r Chapter 7 now contains material on radix sort, and a new section on lower-bound
proofs has been added. Sorting code makes use of move operations that were
introduced in C++11.
r Chapter 8 uses the new union/find analysis by Seidel and Sharir and shows the
O( M α(M, N) ) bound instead of the weaker O( M log∗ N ) bound in prior editions.
r Chapter 12 adds material on suffix trees and suffix arrays, including the linear-time
suffix array construction algorithm by Karkkainen and Sanders (with implementation).
The sections covering deterministic skip lists and AA-trees have been removed.
r Throughout the text, the code has been updated to use C++11. Notably, this means
use of the new C++11 features, including the auto keyword, the range for loop, move
construction and assignment, and uniform initialization.
Overview
Chapter 1 contains review material on discrete math and recursion. I believe the only way
to be comfortable with recursion is to see good uses over and over. Therefore, recursion
is prevalent in this text, with examples in every chapter except Chapter 5. Chapter 1 also
includes material that serves as a review of basic C++. Included is a discussion of templates
and important constructs in C++ class design.
Chapter 2 deals with algorithm analysis. This chapter explains asymptotic analysis
and its major weaknesses. Many examples are provided, including an in-depth explana-
tion of logarithmic running time. Simple recursive programs are analyzed by intuitively
converting them into iterative programs. More complicated divide-and-conquer programs
are introduced, but some of the analysis (solving recurrence relations) is implicitly delayed
until Chapter 7, where it is performed in detail.
Preface xvii
Chapter 3 covers lists, stacks, and queues. This chapter includes a discussion of the STL
vector and list classes, including material on iterators, and it provides implementations
of a significant subset of the STL vector and list classes.
Chapter 4 covers trees, with an emphasis on search trees, including external search
trees (B-trees). The UNIX file system and expression trees are used as examples. AVL trees
and splay trees are introduced. More careful treatment of search tree implementation details
is found in Chapter 12. Additional coverage of trees, such as file compression and game
trees, is deferred until Chapter 10. Data structures for an external medium are considered
as the final topic in several chapters. Included is a discussion of the STL set and map classes,
including a significant example that illustrates the use of three separate maps to efficiently
solve a problem.
Chapter 5 discusses hash tables, including the classic algorithms such as sepa-
rate chaining and linear and quadratic probing, as well as several newer algorithms,
namely cuckoo hashing and hopscotch hashing. Universal hashing is also discussed, and
extendible hashing is covered at the end of the chapter.
Chapter 6 is about priority queues. Binary heaps are covered, and there is additional
material on some of the theoretically interesting implementations of priority queues. The
Fibonacci heap is discussed in Chapter 11, and the pairing heap is discussed in Chapter 12.
Chapter 7 covers sorting. It is very specific with respect to coding details and analysis.
All the important general-purpose sorting algorithms are covered and compared. Four
algorithms are analyzed in detail: insertion sort, Shellsort, heapsort, and quicksort. New to
this edition is radix sort and lower bound proofs for selection-related problems. External
sorting is covered at the end of the chapter.
Chapter 8 discusses the disjoint set algorithm with proof of the running time. This is a
short and specific chapter that can be skipped if Kruskal’s algorithm is not discussed.
Chapter 9 covers graph algorithms. Algorithms on graphs are interesting, not only
because they frequently occur in practice but also because their running time is so heavily
dependent on the proper use of data structures. Virtually all of the standard algorithms
are presented along with appropriate data structures, pseudocode, and analysis of running
time. To place these problems in a proper context, a short discussion on complexity theory
(including NP-completeness and undecidability) is provided.
Chapter 10 covers algorithm design by examining common problem-solving tech-
niques. This chapter is heavily fortified with examples. Pseudocode is used in these later
chapters so that the student’s appreciation of an example algorithm is not obscured by
implementation details.
Chapter 11 deals with amortized analysis. Three data structures from Chapters 4 and
6 and the Fibonacci heap, introduced in this chapter, are analyzed.
Chapter 12 covers search tree algorithms, the suffix tree and array, the k-d tree, and
the pairing heap. This chapter departs from the rest of the text by providing complete and
careful implementations for the search trees and pairing heap. The material is structured so
that the instructor can integrate sections into discussions from other chapters. For example,
the top-down red-black tree in Chapter 12 can be discussed along with AVL trees (in
Chapter 4).
Chapters 1 to 9 provide enough material for most one-semester data structures courses.
If time permits, then Chapter 10 can be covered. A graduate course on algorithm analysis
could cover chapters 7 to 11. The advanced data structures analyzed in Chapter 11 can
easily be referred to in the earlier chapters. The discussion of NP-completeness in Chapter 9
xviii Preface
is far too brief to be used in such a course. You might find it useful to use an additional
work on NP-completeness to augment this text.
Exercises
Exercises, provided at the end of each chapter, match the order in which material is pre-
sented. The last exercises may address the chapter as a whole rather than a specific section.
Difficult exercises are marked with an asterisk, and more challenging exercises have two
asterisks.
References
References are placed at the end of each chapter. Generally the references either are his-
torical, representing the original source of the material, or they represent extensions and
improvements to the results given in the text. Some references represent solutions to
exercises.
Supplements
The following supplements are available to all readers at http://cssupport.pearsoncmg.com/
Acknowledgments
Many, many people have helped me in the preparation of books in this series. Some are
listed in other versions of the book; thanks to all.
As usual, the writing process was made easier by the professionals at Pearson. I’d like
to thank my editor, Tracy Johnson, and production editor, Marilyn Lloyd. My wonderful
wife Jill deserves extra special thanks for everything she does.
Finally, I’d like to thank the numerous readers who have sent e-mail messages and
pointed out errors or inconsistencies in earlier versions. My website www.cis.fiu.edu/~weiss
will also contain updated source code (in C++ and Java), an errata list, and a link to submit
bug reports.
M.A.W.
Miami, Florida
C H A P T E R 1
Programming: A General
Overview
In this chapter, we discuss the aims and goals of this text and briefly review programming
concepts and discrete mathematics. We will . . .
r See that how a program performs for reasonably large input is just as important as its
performance on moderate amounts of input.
r Summarize the basic mathematical background needed for the rest of the book.
r Briefly review recursion.
r Summarize some important features of C++ that are used throughout the text.
1 2 3 4
1 t h i s
2 w a t s
3 o a h g
4 f g d t
because they are entirely impractical for input sizes that a third algorithm can handle in a
reasonable amount of time.
A second problem is to solve a popular word puzzle. The input consists of a two-
dimensional array of letters and a list of words. The object is to find the words in the puzzle.
These words may be horizontal, vertical, or diagonal in any direction. As an example, the
puzzle shown in Figure 1.1 contains the words this, two, fat, and that. The word this begins
at row 1, column 1, or (1,1), and extends to (1,4); two goes from (1,1) to (3,1); fat goes
from (4,1) to (2,3); and that goes from (4,4) to (1,1).
Again, there are at least two straightforward algorithms that solve the problem. For each
word in the word list, we check each ordered triple (row, column, orientation) for the pres-
ence of the word. This amounts to lots of nested for loops but is basically straightforward.
Alternatively, for each ordered quadruple (row, column, orientation, number of characters)
that doesn’t run off an end of the puzzle, we can test whether the word indicated is in the
word list. Again, this amounts to lots of nested for loops. It is possible to save some time
if the maximum number of characters in any word is known.
It is relatively easy to code up either method of solution and solve many of the real-life
puzzles commonly published in magazines. These typically have 16 rows, 16 columns, and
40 or so words. Suppose, however, we consider the variation where only the puzzle board is
given and the word list is essentially an English dictionary. Both of the solutions proposed
require considerable time to solve this problem and therefore might not be acceptable.
However, it is possible, even with a large word list, to solve the problem very quickly.
An important concept is that, in many problems, writing a working program is not
good enough. If the program is to be run on a large data set, then the running time becomes
an issue. Throughout this book we will see how to estimate the running time of a program
for large inputs and, more important, how to compare the running times of two programs
without actually coding them. We will see techniques for drastically improving the speed
of a program and for determining program bottlenecks. These techniques will enable us to
find the section of the code on which to concentrate our optimization efforts.
1.2.1 Exponents
XA XB = XA+B
XA
= XA−B
XB
(XA )B = XAB
XN + XN = 2XN = X2N
2N + 2N = 2N+1
1.2.2 Logarithms
In computer science, all logarithms are to the base 2 unless specified otherwise.
Definition 1.1
XA = B if and only if logX B = A
Several convenient equalities follow from this definition.
Theorem 1.1
logC B
logA B = ; A, B, C > 0, A = 1
logC A
Proof
Let X = logC B, Y = logC A, and Z = logA B. Then, by the definition of loga-
rithms, CX = B, CY = A, and AZ = B. Combining these three equalities yields
B = CX = (CY )Z . Therefore, X = YZ, which implies Z = X/Y, proving the theorem.
Theorem 1.2
Proof
Let X = log A, Y = log B, and Z = log AB. Then, assuming the default base of 2,
2X = A, 2Y = B, and 2Z = AB. Combining the last three equalities yields
2X 2Y = AB = 2Z . Therefore, X + Y = Z, which proves the theorem.
Some other useful formulas, which can all be derived in a similar manner, follow.
log(AB ) = B log A
1.2.3 Series
The easiest formulas to remember are
N
2i = 2N+1 − 1
i=0
and as N tends to ∞, the sum approaches 1/(1 − A). These are the “geometric series”
formulas.
We can derive the last formula for ∞i=0 A (0 < A < 1) in the following manner. Let
i
Another type of common series in analysis is the arithmetic series. Any such series can
be evaluated from the basic formula:
N
N(N + 1) N2
i= ≈
2 2
i=1
N
N(N + 1)(2N + 1) N3
i2 = ≈
6 3
i=1
N
Nk+1
ik ≈ k = −1
|k + 1|
i=1
When k = −1, the latter formula is not valid. We then need the following formula,
which is used far more in computer science than in other mathematical disciplines. The
numbers HN are known as the harmonic numbers, and the sum is known as a harmonic
sum. The error in the following approximation tends to γ ≈ 0.57721566, which is known
as Euler’s constant.
N
1
HN = ≈ loge N
i
i=1
N
f(N) = Nf(N)
i=1
N
N 0 −1
n
f(i) = f(i) − f(i)
i=n0 i=1 i=1
Often, N is a prime number. In that case, there are three important theorems:
There are many theorems that apply to modular arithmetic, and some of them require
extraordinary proofs in number theory. We will use modular arithmetic sparingly, and the
preceding theorems will suffice.
Proof by Induction
A proof by induction has two standard parts. The first step is proving a base case, that is,
establishing that a theorem is true for some small (usually degenerate) value(s); this step is
almost always trivial. Next, an inductive hypothesis is assumed. Generally this means that
the theorem is assumed to be true for all cases up to some limit k. Using this assumption,
the theorem is then shown to be true for the next value, which is typically k + 1. This
proves the theorem (as long as k is finite).
As an example, we prove that the Fibonacci numbers, F0 = 1, F1 = 1, F2 = 2, F3 = 3,
F4 = 5, . . . , Fi = Fi−1 + Fi−2 , satisfy Fi < (5/3)i , for i ≥ 1. (Some definitions have F0 = 0,
which shifts the series.) To do this, we first verify that the theorem is true for the trivial
cases. It is easy to verify that F1 = 1 < 5/3 and F2 = 2 < 25/9; this proves the basis.
We assume that the theorem is true for i = 1, 2, . . . , k; this is the inductive hypothesis. To
prove the theorem, we need to show that Fk+1 < (5/3)k+1 . We have
Fk+1 = Fk + Fk−1
by the definition, and we can use the inductive hypothesis on the right-hand side,
obtaining
CHAPTER I
PRIMITIVE MAN
ebooksecure.com