Algorithm
Table of Contents
About
Remarks
Introduction to Algorithms
Examples
Chapter 2: A* Pathfinding
Examples
Introduction to A*
Introduction
Examples
Introduction
Examples
Sample Example
Remarks
Work
Span
Examples
Big-Theta notation
Big-Omega Notation
Formal definition
Notes
References
Links
Introduction
Remarks
Definitions
Examples
Fibonacci Numbers
Notes
Remarks
Sources
Examples
Ticket automat
Interval Scheduling
Minimizing Lateness
Offline Caching
Example (FIFO)
Example (LFD)
FIFO
LIFO
LRU
LFU
LFD
Algorithm vs Reality
Remarks
Examples
Single Source Shortest Path Algorithm (Given there is a negative cycle in a graph)
Remarks
Examples
A Simple Loop
A Nested Loop
An O(log n) example
Introduction
Naïve approach
Dichotomy
Explanation
Conclusion
Introduction
Examples
Introduction
Examples
Examples
Parameters
Examples
Bubble Sort
Implementation in Javascript
Implementation in C#
Implementation in Java
Python Implementation
Examples
C# Implementation
Examples
C# Implementation
Examples
If a given input tree follows Binary search tree property or not
Introduction
Examples
Examples
C# Implementation
Examples
Examples
Examples
Introduction
Remarks
Examples
Examples
Introduction
Examples
Examples
Examples
Examples
Introduction
Remarks
Examples
Examples
Remarks
Examples
Analysis
Examples
Examples
Links
Boolean
SByte
Char
Int16
Decimal
Object
String
ValueType
Nullable<T>
Array
References
Examples
C# Implementation
Remarks
Examples
Algorithm Basics
C# Implementation
Examples
Remarks
Examples
Introduction
Examples
KMP-Example
Remarks
Examples
Introduction
Examples
Examples
C# Implementation
Introduction
Examples
Examples
Examples
C# Implementation
Examples
C# Implementation
Examples
Introduction
Syntax
Examples
Examples
Remarks
Theory
Sources
Examples
Preface
Paging
Examples
C# Implementation
Examples
Examples
C# Implementation
Chapter 52: polynomial-time bounded algorithm for Minimum Vertex Cover
Introduction
Parameters
Remarks
Examples
Examples
Remarks
Examples
Typed
No type
Functions
Remarks
Examples
C# Implementation
Examples
Introduction
Examples
Examples
C# Implementation
Examples
Examples
Parameters
Examples
Remarks
Examples
Remarks
Examples
Introduction
Credits
About
You can share this PDF with anyone you feel could benefit from it; download the latest version
from: algorithm
It is an unofficial and free algorithm ebook created for educational purposes. All the content is
extracted from Stack Overflow Documentation, which is written by many hardworking individuals at
Stack Overflow. It is neither affiliated with Stack Overflow nor official algorithm.
The content is released under Creative Commons BY-SA, and the list of contributors to each
chapter are provided in the credits section at the end of this book. Images may be copyright of
their respective owners unless otherwise specified. All trademarks and registered trademarks are
the property of their respective company owners.
Use the content presented in this book at your own risk; it is not guaranteed to be correct nor
accurate. Please send your feedback and corrections to info@zzzprojects.com
Chapter 1: Getting started with algorithm
Remarks
Introduction to Algorithms
Algorithms are ubiquitous in Computer Science and Software Engineering. Selection of
appropriate algorithms and data structures improves our program efficiency in cost and time.
A procedure that lacks finiteness but satisfies all other characteristics of an algorithm may be
called a computational method. [Knuth:1997:ACP:260999]
Examples
A sample algorithmic problem
An algorithmic problem is specified by describing the complete set of instances it must work on
and of its output after running on one of these instances. This distinction, between a problem and
an instance of a problem, is fundamental. The algorithmic problem known as sorting is defined as
follows: [Skiena:2008:ADM:1410219]
• Problem: Sorting
• Input: A sequence of n keys, a_1, a_2, ..., a_n.
• Output: The reordering of the input sequence such that a'_1 <= a'_2 <= ... <= a'_{n-1} <= a'_n
For those of you who are new to programming in Swift, and those of you coming from different
programming bases, such as Python or Java, this article should be quite helpful. In this post, we
will discuss a simple solution for implementing Swift algorithms.
Fizz Buzz
You may have seen Fizz Buzz written as Fizz Buzz, FizzBuzz, or Fizz-Buzz; they're all referring to
the same thing. That "thing" is the main topic of discussion today. First, what is FizzBuzz?
1 2 3 4 5 6 7 8 9 10
Fizz and Buzz refer to any number that's a multiple of 3 and 5 respectively. In other words, if a
number is divisible by 3, it is substituted with fizz; if a number is divisible by 5, it is substituted with
buzz. If a number is simultaneously a multiple of 3 AND 5, the number is replaced with "fizz buzz."
In essence, it emulates the famous children's game "fizz buzz".
To work on this problem, open up Xcode to create a new playground and initialize an array like
below:
// for example
let number = [1,2,3,4,5]
// here 3 is fizz and 5 is buzz
To find all the fizz and buzz, we must iterate through the array and check which numbers are fizz
and which are buzz. To do this, create a for loop to iterate through the array we have initialised:
After this, we can simply use an "if else" condition and the modulo operator in Swift, i.e. %, to
locate the fizz values:

for num in number {
    if num % 3 == 0 {
        print("fizz")
    } else {
        print(num)
    }
}
Great! You can go to the debug console in Xcode playground to see the output. You will find that
the "fizzes" have been sorted out in your array.
For the Buzz part, we will use the same technique. Give it a try before scrolling on; you can
check your results against this article once you have finished. It's rather straightforward:
divide the number by 3 for fizz, and divide the number by 5 for buzz.
Now, increase the numbers in the array
We increased the range of numbers from 1-10 to 1-15 in order to demonstrate the concept of a
"fizz buzz." Since 15 is a multiple of both 3 and 5, the number should be replaced with "fizz buzz."
Try for yourself and check the answer!
Wait... it's not over though! The whole purpose of the algorithm is to tune the runtime. Imagine
the range increases from 1-15 to 1-100. The program would check each number to determine whether
it is divisible by 3 or 5, and would then have to run through the numbers again to check whether
they are divisible by both 3 and 5. The code would essentially have to run through each number in
the array twice: first testing against 3, then testing against 5. To speed up the process, we can
simply tell our code to test divisibility by 15 directly.
Here is the final code:
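(The final listing did not survive the extraction of this book. Since the text above notes you can
use any language of your choice, here is a minimal C++ sketch of the single-pass version just
described, checking divisibility by 15 first; the 1-15 range is taken from the text.)

#include <iostream>

int main()
{
    for (int num = 1; num <= 15; ++num) {
        if (num % 15 == 0)            // multiple of both 3 and 5
            std::cout << "fizz buzz" << std::endl;
        else if (num % 3 == 0)
            std::cout << "fizz" << std::endl;
        else if (num % 5 == 0)
            std::cout << "buzz" << std::endl;
        else
            std::cout << num << std::endl;
    }
    return 0;
}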
As simple as that. You can use any language of your choice and get started.
Enjoy coding!
Chapter 2: A* Pathfinding
Examples
Introduction to A*
A* (A star) is a search algorithm that is used for finding a path from one node to another. It can
thus be compared with Breadth First Search, Dijkstra's algorithm, Depth First Search, or Best
First Search. The A* algorithm is widely used in graph search because of its efficiency and
accuracy, in settings where graph pre-processing is not an option.

f(n) = g(n) + h(n), where g(n) is the cost of the path from the initial node to n and h(n) is a
heuristic estimate of the cost from n to the goal; f(n) therefore estimates the cost of the
cheapest path from the initial node to the goal that is constrained to go through node n.

A* is an informed search algorithm and it always guarantees to find the path with minimum cost
(if it uses an admissible heuristic). So it is both complete and optimal.
Problem definition:
An 8-puzzle is a simple game consisting of a 3 x 3 grid (containing 9 squares). One of the squares
is empty. The objective is to slide the squares around into different positions until the numbers
appear in the "goal state".

Given an initial state of the 8-puzzle game and a final state to be reached, find the most
cost-effective path to reach the final state from the initial state.
Initial state:
_ 1 3
4 2 5
7 8 6
Final state:
1 2 3
4 5 6
7 8 _
Heuristic to be assumed:

Let us consider the Manhattan distance between the current and final state as the heuristic for
this problem statement:

h(n) = | x - p | + | y - q |, summed over all tiles

where x and y are a tile's cell co-ordinates in the current state, and p and q are that tile's
cell co-ordinates in the final state

f(n) = g(n) + h(n), where g(n) is the cost required to reach the current state from the given
initial state
First we find the heuristic value required to reach the final state from the initial state. The
cost function g(n) = 0, as we are in the initial state:

h(n) = 8

The above value is obtained because the 1 in the current state is 1 horizontal distance away from
the 1 in the final state; the same goes for 2, 5 and 6. The blank _ is 2 horizontal distances and
2 vertical distances away. So the total value for h(n) is 1 + 1 + 1 + 1 + 2 + 2 = 8, and the total
cost function f(n) is equal to 8 + 0 = 8.
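(As an illustration, here is a small C++ sketch, not part of the original text, that computes this
heuristic by summing the Manhattan distances of all tiles; the flat-array state encoding with 0
standing for the blank is an assumption.)

#include <cstdlib>
#include <iostream>

// h(n): sum of Manhattan distances of every tile (including the blank,
// as in the worked example above) between a current and a goal 3x3 state
int manhattan(const int current[9], const int goal[9])
{
    int h = 0;
    for (int tile = 0; tile <= 8; ++tile) {   // 0 denotes the blank square
        int x = 0, y = 0, p = 0, q = 0;
        for (int i = 0; i < 9; ++i) {
            if (current[i] == tile) { x = i / 3; y = i % 3; }
            if (goal[i] == tile)    { p = i / 3; q = i % 3; }
        }
        h += std::abs(x - p) + std::abs(y - q);
    }
    return h;
}

int main()
{
    int current[9] = {0, 1, 3, 4, 2, 5, 7, 8, 6};  // the initial state, _ encoded as 0
    int goal[9]    = {1, 2, 3, 4, 5, 6, 7, 8, 0};  // the final state
    std::cout << manhattan(current, goal) << std::endl;  // prints 8
    return 0;
}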
Now, the possible states that can be reached from initial state are found and it happens that we
can either move _ to right or downwards.
1 _ 3 4 1 3
4 2 5 _ 2 5
7 8 6 7 8 6
(1) (2)
Again the total cost function is computed for these states using the method described above, and
it turns out to be 6 and 7 respectively. We choose the state with the minimum cost, which is state
(1). The next possible moves are Left, Right or Down. We won't move Left, as we were previously in
that state. So we can move Right or Down.
1 3 _ 1 2 3
4 2 5 4 _ 5
7 8 6 7 8 6
(3) (4)
(3) leads to a cost function equal to 6 and (4) to 4. We also still consider (2), obtained before,
which has a cost function equal to 7. Choosing the minimum leads us to (4). The next possible
moves are Left, Right or Down. We get the states:
1 2 3 1 2 3 1 2 3
_ 4 5 4 5 _ 4 8 5
7 8 6 7 8 6 7 _ 6
(5) (6) (7)
We get costs equal to 5, 2 and 4 for (5), (6) and (7) respectively. We also still have the
previous states (3) and (2) with 6 and 7 respectively. We choose the minimum cost state, which is
(6). The next possible moves are Up and Down, and clearly Down will lead us to the final state,
with a heuristic function value equal to 0.
Chapter 3: A* Pathfinding Algorithm
Introduction
This topic is going to focus on the A* Pathfinding algorithm, how it's used, and why it works.
Note to future contributors: I have added an example for A* Pathfinding without any obstacles, on
a 4x4 grid. An example with obstacles is still needed.
Examples
Simple Example of A* Pathfinding: A maze with no obstacles
Let's assume that this is a maze. There are no walls/obstacles, though. We only have a starting
point (the green square), and an ending point (the red square). Let's also assume that in order to
get from green to red, we cannot move diagonally. So, starting from the green square, let's see
which squares we can move to, and highlight them in blue:
In order to choose which square to move to next, we compute three values for each candidate
square:

1. The "g" value - this is how far away this node is from the green square.
2. The "h" value - this is how far away this node is from the red square.
3. The "f" value - this is the sum of the "g" value and the "h" value. This is the final number
which tells us which node to move to.
In order to calculate these heuristics, this is the formula we will use: distance = abs(from.x - to.x)
+ abs(from.y - to.y)
Let's calculate the "g" value for the blue square immediately to the left of the green square: abs(3 -
2) + abs(2 - 2) = 1
Great! We've got the value: 1. Now, let's try calculating the "h" value: abs(2 - 0) + abs(2 - 0) = 4
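(Here is a minimal C++ sketch, not part of the original text, of exactly this computation; the
coordinates are the ones used in the worked calculation above.)

#include <cstdlib>
#include <iostream>

struct Point { int x, y; };

// Manhattan distance, used for both the g and the h value in this example
int distance(Point from, Point to)
{
    return std::abs(from.x - to.x) + std::abs(from.y - to.y);
}

int main()
{
    Point green = {3, 2};   // start (coordinates assumed from the example)
    Point red   = {0, 0};   // goal
    Point node  = {2, 2};   // the blue square immediately left of the green one

    int g = distance(green, node);   // 1
    int h = distance(node, red);     // 4
    int f = g + h;                   // 5
    std::cout << "g=" << g << " h=" << h << " f=" << f << std::endl;
    return 0;
}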
Let's do the same for all the other blue squares. The big number in the center of each square is
the "f" value, while the number on the top left is the "g" value, and the number on the top right is
the "h" value:
We've calculated the g, h, and f values for all of the blue nodes. Now, which do we pick?
However, in this case, we have 2 nodes with the same f value, 5. How do we pick between them?
Simply, either choose one at random, or have a priority set. I usually prefer to have a priority like
so: "Right > Up > Down > Left"
One of the nodes with the f value of 5 takes us in the "Down" direction, and the other takes us
"Left". Since Down is at a higher priority than Left, we choose the square which takes us "Down".
I now mark the nodes which we calculated the heuristics for, but did not move to, as orange, and
the node which we chose as cyan:
Alright, now let's calculate the same heuristics for the nodes around the cyan node:
Again, we choose the node going down from the cyan node, as all the options have the same f
value:
Let's calculate the heuristics for the only neighbour that the cyan node has:

Alright, we follow the same pattern we have been following:
Once more, let's calculate the heuristics for the node's neighbour:
Finally, we can see that we have a winning square beside us, so we move there, and we are done.
Chapter 4: Algo:- Print a m*n matrix in square wise
Introduction
Check sample input and output below.
Examples
Sample Example
Input :-
14 15 16 17 18 21
19 10 20 11 54 36
64 55 44 23 80 39
91 92 93 94 95 42
Output:-

Print the values in spiral order:
14 15 16 17 18 21 36 39 42 95 94 93 92 91 64 19 10 20 11 54 80 23 44 55

or print the indices (row, column) in spiral order:
00 01 02 03 04 05 15 25 35 34 33 32 31 30 20 10 11 12 13 14 24 23 22 21
function noOfLooping(m, n) {
    // number of concentric rectangles ("rings") to print
    var smallestValue;
    if (m > n) {
        smallestValue = n;
    } else {
        smallestValue = m;
    }
    if (smallestValue % 2 == 0) {
        return smallestValue / 2;
    } else {
        return (smallestValue + 1) / 2;
    }
}

function squarePrint(m, n) {
    var looping = noOfLooping(m, n);
    for (var i = 0; i < looping; i++) {
        // top row of ring i, left to right
        for (var j = i; j < m - 1 - i; j++) {
            console.log(i + '' + j);
        }
        // right column, top to bottom (j === m - 1 - i here)
        for (var k = i; k < n - 1 - i; k++) {
            console.log(k + '' + j);
        }
        // bottom row, right to left (k === n - 1 - i here)
        for (var l = j; l > i; l--) {
            console.log(k + '' + l);
        }
        // left column, bottom to top (l === i here)
        for (var x = k; x > i; x--) {
            console.log(x + '' + l);
        }
    }
}

squarePrint(6, 4);
Chapter 5: Algorithm Complexity
Remarks
All algorithms are a list of steps to solve a problem. Each step has dependencies on some set of
previous steps, or on the start of the algorithm. A small problem can be drawn as a graph of such
dependencies (figure omitted).

This structure is called a directed acyclic graph, or DAG for short. The links between the nodes
in the graph represent dependencies in the order of operations, and there are no cycles in the
graph.
total = 0
for(i = 1; i < 10; i++)
total = total + i
In this pseudocode, each iteration of the for loop is dependent on the result from the previous
iteration, because we are using the value calculated in the previous iteration in the next one.
The DAG for this code is therefore a simple chain of nodes (figure omitted).
If you understand this representation of algorithms, you can use it to understand algorithm
complexity in terms of work and span.
Work
Work is the actual number of operations that need to be executed in order to achieve the goal of
the algorithm for a given input size n.
Span
Span is sometimes referred to as the critical path, and is the fewest number of steps an algorithm
must make in order to achieve the goal of the algorithm.
On our sample DAG, the work is the number of nodes in the graph as a whole. The span is the
critical path: the longest path from the start to the end (the original figure highlighting both
is omitted). When work can be done in parallel, the span is the fewest number of steps required.
When work must be done serially, the span is the same as the work.
Both work and span can be evaluated independently in terms of analysis. The speed of an
algorithm is determined by the span. The amount of computational power required is determined
by the work.
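(A sketch, not part of the original text, to make the distinction concrete: summing n numbers
serially has both work and span proportional to n, while summing them pairwise along a balanced
tree keeps the work at n-1 additions but shrinks the span to about log n, because additions on the
same tree level are independent and could run in parallel.)

#include <iostream>

// serial sum: work Θ(n), span Θ(n) - every addition depends on the previous one
int serialSum(const int *a, int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += a[i];
    return total;
}

// tree-shaped sum: work Θ(n), span Θ(log n) - the two halves are independent
int treeSum(const int *a, int n)
{
    if (n == 1)
        return a[0];
    int half = n / 2;
    return treeSum(a, half) + treeSum(a + half, n - half);
}

int main()
{
    int a[] = {1, 2, 3, 4, 5, 6, 7, 8};
    std::cout << serialSum(a, 8) << " " << treeSum(a, 8) << std::endl; // 36 36
    return 0;
}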
Examples
Big-Theta notation
Unlike Big-O notation, which represents only upper bound of the running time for some algorithm,
Big-Theta is a tight bound; both upper and lower bound. Tight bound is more precise, but also
more difficult to compute.
An intuitive way to grasp it is that f(x) = Ө(g(x)) means that the graphs of f(x) and g(x) grow at
the same rate, or that the graphs 'behave' similarly for big enough values of x.
An example
If an algorithm for the input n takes 42n^2 + 25n + 4 operations to finish, we say that it is
O(n^2), but it is also O(n^3) and O(n^100). However, it is Ө(n^2) and it is not Ө(n^3), Ө(n^4)
etc. An algorithm that is Ө(f(n)) is also O(f(n)), but not vice versa!
Ө(g(x)) = {f(x) such that there exist positive constants c1, c2, N such that 0 <= c1*g(x) <= f(x) <= c2*g(x) for all x > N}
Because Ө(g(x)) is a set, we could write f(x) ∈ Ө(g(x)) to indicate that f(x) is a member of Ө(g(x))
. Instead, we will usually write f(x) = Ө(g(x)) to express the same notion - that's the common way.
Whenever Ө(g(x)) appears in a formula, we interpret it as standing for some anonymous function
that we do not care to name. For example the equation T(n) = T(n/2) + Ө(n), means T(n) = T(n/2)
+ f(n) where f(n) is a function in the set Ө(n).
Let f and g be two functions defined on some subset of the real numbers. We write f(x) = Ө(g(x))
as x -> infinity if and only if there are positive constants K and L and a real number x0 such
that:

K*|g(x)| <= |f(x)| <= L*|g(x)| for all x >= x0

In particular, if limit(x->infinity) f(x)/g(x) = c ∈ (0,∞), i.e. the limit exists and is positive,
then f(x) = Ө(g(x))
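(A quick numeric check, not part of the original text: for f(n) = 42n^2 + 25n + 4 and g(n) = n^2,
the ratio f(n)/g(n) settles near the positive constant 42, which is exactly what the limit
criterion asserts.)

#include <cstdio>

int main()
{
    for (double n = 1; n <= 1000000; n *= 10) {
        double f = 42*n*n + 25*n + 4;
        double g = n*n;
        std::printf("n=%.0f f(n)/g(n)=%f\n", n, f / g);  // tends to 42
    }
    return 0;
}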
Name          Notation     n = 10    n = 100
Constant      Ө(1)         1         1
Logarithmic   Ө(log(n))    3         7
Big-Omega Notation
Formal definition
Let f(n) and g(n) be two functions defined on the set of the positive real numbers. We write
f(n) = Ω(g(n)) if there are positive constants c and n0 such that:

0 <= c*g(n) <= f(n) for all n >= n0
Notes
f(n) = Ω(g(n)) means that f(n) grows asymptotically no slower than g(n). We also use Ω(g(n)) when
the analysis of an algorithm is not strong enough for a statement about Θ(g(n)) or O(g(n)).

For any two functions f(n) and g(n) we have f(n) = Ө(g(n)) if and only if f(n) = O(g(n)) and
f(n) = Ω(g(n)).

For example, let f(n) = 3n^2 + 5n - 4. Then f(n) = Ω(n^2). It is also correct that f(n) = Ω(n), or
even f(n) = Ω(1).
Another example: consider a perfect matching algorithm that outputs "No Perfect Matching" if the
number of vertices is odd and otherwise tries all possible matchings.

We would like to say the algorithm requires exponential time, but in fact you cannot prove an
Ω(n^2) lower bound using the usual definition of Ω, since the algorithm runs in linear time for
odd n. We should instead define f(n) = Ω(g(n)) by saying that for some constant c > 0,
f(n) >= c*g(n) for infinitely many n. This gives a nice correspondence between upper and lower
bounds: f(n) = Ω(g(n)) iff f(n) != o(g(n)).
References
Formal definition and theorem are taken from the book "Thomas H. Cormen, Charles E.
Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms".
Let f(n) and g(n) be two functions defined on the set of the positive real numbers; c, c1, c2, n0
are positive real constants.

                          f(n) = O(g(n))                    f(n) = Ω(g(n))
Formal definition         ∃ c > 0, ∃ n0 > 0 : ∀ n ≥ n0,     ∃ c > 0, ∃ n0 > 0 : ∀ n ≥ n0,
                          0 ≤ f(n) ≤ c g(n)                 0 ≤ c g(n) ≤ f(n)
Analogy between the
asymptotic comparison
of f, g and real
numbers a, b              a ≤ b                             a ≥ b

(The original graphic interpretations were figures and are omitted.)
Links
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to
Algorithms.
Read Algorithm Complexity online: https://riptutorial.com/algorithm/topic/1529/algorithm-complexity
Chapter 6: Applications of Dynamic
Programming
Introduction
The basic idea behind dynamic programming is breaking a complex problem down to several
small and simple problems that are repeated. If you can identify a simple subproblem that is
repeatedly calculated, odds are there is a dynamic programming approach to the problem.
As this topic is titled Applications of Dynamic Programming, it will focus more on applications
rather than the process of creating dynamic programming algorithms.
Remarks
Definitions
Memoization - an optimization technique used primarily to speed up computer programs by storing
the results of expensive function calls and returning the cached result when the same inputs occur
again.
Dynamic Programming - a method for solving a complex problem by breaking it down into a
collection of simpler subproblems, solving each of those subproblems just once, and storing their
solutions.
Examples
Fibonacci Numbers
Fibonacci Numbers are a prime subject for dynamic programming as the traditional recursive
approach makes a lot of repeated calculations. In these examples I will be using the base case of
f(0) = f(1) = 1.
Here is an example recursion tree for fibonacci(4); note the repeated computations (figure
omitted):
Non-Dynamic Programming O(2^n) Runtime Complexity, O(n) Stack complexity
def fibonacci(n):
if n < 2:
return 1
return fibonacci(n-1) + fibonacci(n-2)
This is the most intuitive way to write the problem. At most the stack space will be O(n) as you
descend the first recursive branch making calls to fibonacci(n-1) until you hit the base case n < 2.
The O(2^n) runtime complexity proof that can be seen here: Computational complexity of Fibonacci
Sequence. The main point to note is that the runtime is exponential, which means the runtime for
this will double for every subsequent term, fibonacci(15) will take twice as long as fibonacci(14).
Memoized O(n) Runtime Complexity, O(n) Space complexity, O(n) Stack complexity
memo = []
memo.append(1) # f(0) = 1
memo.append(1) # f(1) = 1

def fibonacci(n):
    if len(memo) > n:
        return memo[n]
    result = fibonacci(n-1) + fibonacci(n-2)
    memo.append(result)  # cache the computed value
    return result
With the memoized approach we introduce an array that can be thought of as all the previous
function calls. The location memo[n] is the result of the function call fibonacci(n). This allows us to
trade space complexity of O(n) for a O(n) runtime as we no longer need to compute duplicate
function calls.
Iterative Dynamic Programming O(n) Runtime complexity, O(n) Space complexity, No recursive stack
def fibonacci(n):
    memo = [1, 1]  # f(0) = 1, f(1) = 1
    for i in range(2, n + 1):
        memo.append(memo[i-1] + memo[i-2])
    return memo[n]
If we break the problem down into its core elements, you will notice that in order to compute
fibonacci(n) we need fibonacci(n-1) and fibonacci(n-2). We can also notice that our base case
will appear at the end of that recursion tree as seen above.
With this information, it now makes sense to compute the solution backwards, starting at the base
cases and working upwards. Now in order to calculate fibonacci(n) we first calculate all the
fibonacci numbers up to and through n.
This main benefit here is that we now have eliminated the recursive stack while keeping the O(n)
runtime. Unfortunately, we still have an O(n) space complexity but that can be changed as well.
Advanced Iterative Dynamic Programming O(n) Runtime complexity, O(1) Space complexity, No
recursive stack
def fibonacci(n):
    memo = [1, 1]  # f(0) = 1, f(1) = 1
    for i in range(2, n + 1):
        memo[i % 2] = memo[0] + memo[1]
    return memo[n % 2]
As noted above, the iterative dynamic programming approach starts from the base cases and
works to the end result. The key observation to make in order to get to the space complexity to
O(1) (constant) is the same observation we made for the recursive stack - we only need
fibonacci(n-1) and fibonacci(n-2) to build fibonacci(n). This means that we only need to save the
results for fibonacci(n-1) and fibonacci(n-2) at any point in our iteration.
To store these last 2 results I use an array of size 2 and simply flip which index I am assigning to
by using i % 2 which will alternate like so: 0, 1, 0, 1, 0, 1, ..., i % 2.
I add both indexes of the array together because we know that addition is commutative (5 + 6 = 11
and 6 + 5 == 11). The result is then assigned to the older of the two spots (denoted by i % 2). The
final result is then stored at the position n%2
Notes
• It is important to note that sometimes it may be best to come up with an iterative memoized
solution for functions that perform large calculations repeatedly, as you will build up a cache of
the answers to the function calls, and subsequent calls may be O(1) if the answer has already been
computed.
Chapter 7: Applications of Greedy technique
Remarks
Sources
1. The examples above are from lecture notes from a lecture which was taught in 2008 in Bonn,
Germany. They are in turn based on the book Algorithm Design by Jon Kleinberg and Eva Tardos.
Examples
Ticket automat
You have a ticket automat which gives change in coins with values 1, 2, 5, 10 and 20. The
dispensing of the change can be seen as a series of coin drops until the right value is dispensed.
We say a dispensing is optimal when its coin count is minimal for its value.

Let M in [1,50] be the price for the ticket T and P in [1,50] the money somebody paid for T, with
P >= M. Let D = P - M. We define the benefit of a step as the difference between D and D-c, with c
the coin the automat dispenses in this step.
The Greedy Technique for the exchange is the following pseudo algorithmic approach:
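(The pseudocode itself was lost in extraction; what follows is a minimal sketch of the greedy step
just described, written in the same style as the book's other pseudocode.)

Procedure GreedyExchange(D):
while D > 0
    c := largest coin value with c <= D
    dispense c
    D := D - c
end while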
Afterwards the sum of all dispensed coins clearly equals D. It's a greedy algorithm because after
each step, and after each repetition of a step, the benefit is maximized. We cannot dispense
another coin with a higher benefit.
#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

int main()
{
    // coin values, sorted descending; fixed here instead of read from
    // input to keep the example self-contained
    vector<unsigned int> coinValues = { 20, 10, 5, 2, 1 };

    int ticketPrice = 5;   // M in the example
    int paidMoney = 50;    // P in the example (sample values)
    int D = paidMoney - ticketPrice;

    cout << "Coin values: ";
    for (auto i : coinValues)
        cout << i << " ";
    cout << endl;

    // greedy step: always drop the biggest coin that still fits into D
    vector<unsigned int> coinCount;
    for (auto coin : coinValues)
        while (D >= (int)coin)
        {
            coinCount.push_back(coin);
            D -= coin;
        }

    cout << "Change: ";
    for (auto coin : coinCount)
        cout << coin << " ";
    cout << endl;
    return 0;
}
Be aware that there is no input checking, and the coin values are fixed in the program, to keep
the example simple. As long as 1 is among the coin values we know that the algorithm will
terminate, because D strictly decreases with every dispensed coin, and any remaining D >= 1 can
always be matched by the coin of value 1.
But the algorithm has two pitfalls:
1. Let C be the biggest coin value. The runtime is only polynomial as long as D/C is polynomial,
because the representation of D uses only log D bits and the runtime is at least linear in D/C.
2. In every step our algorithm chooses the local optimum. But this is not sufficient to say that
the algorithm finds the global optimal solution (see more information here or in the book of
Korte and Vygen).
A simple counter example: the coins are 1, 3, 4 and D = 6. The optimal solution is clearly two
coins of value 3, but greedy chooses 4 in the first step, so it has to choose 1 in steps two and
three. So it gives no optimal solution. A possible optimal algorithm for this example is based on
dynamic programming, sketched below.
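(A sketch, not part of the original text: the dynamic-programming alternative for the counter
example above. It computes the minimum coin count for every value up to D.)

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> coins = { 1, 3, 4 };     // the counter example above
    int D = 6;

    const int INF = 1000000;
    std::vector<int> minCoins(D + 1, INF);    // minCoins[v] = fewest coins summing to v
    minCoins[0] = 0;
    for (int v = 1; v <= D; ++v)
        for (int c : coins)
            if (c <= v && minCoins[v - c] + 1 < minCoins[v])
                minCoins[v] = minCoins[v - c] + 1;

    std::cout << minCoins[D] << std::endl;    // prints 2 (two coins of value 3)
    return 0;
}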
Interval Scheduling
We have a set of jobs J = {a,b,c,d,e,f,g}. Let j in J be a job; it starts at sj and ends at fj.
Two jobs are compatible if they don't overlap (the original illustration is omitted). The goal is
to find the maximum subset of mutually compatible jobs. There are several greedy approaches for
this problem:

1. Earliest start time: consider jobs in ascending order of sj
2. Earliest finish time: consider jobs in ascending order of fj
3. Shortest interval: consider jobs in ascending order of fj - sj
4. Fewest conflicts: consider jobs in ascending order of the number of conflicting jobs

The question now is which approach is really successful. Earliest start time is definitely not;
there is a simple counter example. Shortest interval is not optimal either, and fewest conflicts
may indeed sound optimal, but there is a problem case for this approach as well (the
counter-example figures are omitted). That leaves us with earliest finish time. The pseudocode is
quite simple:
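(The pseudocode was lost in extraction; a minimal sketch of the earliest-finish-time rule.)

Procedure IntervalScheduling(jobs):
sort jobs in ascending order of finish time fj
A := empty set
for each job j in this order
    if j is compatible with every job in A
        A := A + {j}
end for
Return A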
Or as C++ program:
#include <iostream>
#include <utility>
#include <vector>
#include <algorithm>

using namespace std;

int main()
{
    // example jobs as (start, finish) pairs; the original input was lost
    // in extraction, these values reproduce the output below
    vector<pair<int,int>> jobs = { {1,3}, {2,6}, {4,5}, {6,8}, {5,9}, {9,10} };
    int jobCnt = jobs.size();

    // step 1: sort by earliest finish time
    sort(jobs.begin(), jobs.end(), [](pair<int,int> p1, pair<int,int> p2)
         { return p1.second < p2.second; });

    // step 2: A holds the indices of the chosen jobs
    vector<int> A;

    // step 3: greedily take every job compatible with everything chosen so far
    for (int i = 0; i < jobCnt; ++i)
    {
        auto job = jobs[i];
        bool isCompatible = true;
        for (auto jobIndex : A)
        {
            // test whether the actual job and the job from A are incompatible
            if (job.second >= jobs[jobIndex].first &&
                job.first <= jobs[jobIndex].second)
            {
                isCompatible = false;
                break;
            }
        }
        if (isCompatible)
            A.push_back(i);
    }

    // step 4: print A
    cout << "Compatible: ";
    for (auto i : A)
        cout << "(" << jobs[i].first << "," << jobs[i].second << ") ";
    cout << endl;
    return 0;
}
The output for this example is: Compatible: (1,3) (4,5) (6,8) (9,10)
The implementation of the algorithm is clearly in Θ(n^2). There is a Θ(n log n) implementation and
the interested reader may continue reading below (Java Example).
Now we have a greedy algorithm for the interval scheduling problem, but is it optimal?
Proof (by contradiction):

Assume greedy is not optimal, and let i1, i2, ..., ik denote the set of jobs selected by greedy.
Let j1, j2, ..., jm denote the set of jobs in an optimal solution with i1 = j1, i2 = j2, ...,
ir = jr for the largest possible value of r.

The job i(r+1) exists and finishes before j(r+1) (earliest finish). But then
j1, j2, ..., jr, i(r+1), j(r+2), ..., jm is also an optimal solution, and for all k in [1, r+1]
we have jk = ik. That is a contradiction to the maximality of r. This concludes the proof.
This second example demonstrates that there are usually many possible greedy strategies, but only
some, or even none, might find the optimal solution in every instance. The Java example below
solves the weighted variant of the problem, where every job additionally carries a profit, using
dynamic programming:
import java.util.Arrays;
import java.util.Comparator;

class Job
{
    int start, finish, profit;
    Job(int start, int finish, int profit)
    { this.start = start; this.finish = finish; this.profit = profit; }

    // latest job before jobs[i] that does not overlap jobs[i], or -1
    static int latestNonConflict(Job jobs[], int i)
    {
        for (int j = i - 1; j >= 0; j--)
            if (jobs[j].finish <= jobs[i].start)
                return j;
        return -1;
    }

    // DP over jobs sorted by finish time: either skip job i, or take it
    // together with the best solution of its latest non-conflicting prefix
    static int findMaxProfit(Job jobs[])
    {
        Arrays.sort(jobs, new JobComparator());
        int n = jobs.length;
        int table[] = new int[n];
        table[0] = jobs[0].profit;
        for (int i = 1; i < n; i++)
        {
            int incl = jobs[i].profit;
            int l = latestNonConflict(jobs, i);
            if (l != -1)
                incl += table[l];
            table[i] = Math.max(incl, table[i - 1]);
        }
        return table[n - 1];
    }
}

class JobComparator implements Comparator<Job>
{
    public int compare(Job a, Job b) { return a.finish - b.finish; }
}
Minimizing Lateness
There are numerous problems about minimizing lateness; here we have a single resource which can
only process one job at a time. Job j requires tj units of processing time and is due at time dj.
If j starts at time sj it will finish at time fj = sj + tj. We define the lateness of job j as
Lj = fj - dj (negative if the job finishes early). The goal is to minimize the maximum lateness
L = max{0, max over all j of Lj}.
Consider this example:

Job   1  2  3  4  5  6
tj    3  2  1  4  3  2
dj    6  8  9  9 10 11

Processing the jobs in the order 3, 2, 5, 4, 1, 6 gives the following schedule:

Job   3  2  2  5  5  5  4  4  4  4  1  1  1  6  6
Time  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

with lateness values (in that processing order) of

Lj   -8 -5 -4  1  7  4
The solution L = 7 is obviously not optimal. Let's look at some greedy strategies:
1. Shortest processing time first: schedule jobs in ascending order of processing time tj
2. Earliest deadline first: schedule jobs in ascending order of deadline dj
3. Smallest slack: schedule jobs in ascending order of slack dj - tj
It's easy to see that shortest processing time first is not optimal; a good counter example is

Job   1  2
tj    1  5
dj   10  5

The smallest-slack strategy fails on a similar instance:

Job   1  2
tj    1  5
dj    3  5
The last strategy, earliest deadline first, looks valid, so we start with some pseudocode:
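(The pseudocode was lost in extraction; a minimal sketch of the earliest-deadline-first rule
described above.)

Procedure MinimizingLateness(jobs):
sort jobs in ascending order of deadline dj
t := 0
for each job j in this order
    sj := t
    fj := t + tj
    t := fj
end for
Return the intervals (sj, fj)

And as a C++ program: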
#include <iostream>
#include <utility>
#include <vector>
#include <algorithm>

using namespace std;

int main()
{
    // example input; the original values were lost in extraction,
    // these reproduce the output below
    int jobCnt = 10;
    int processTimes[] = { 2, 3, 3, 1, 3, 5, 4, 2, 2, 1 };
    int dueTimes[]     = { 4, 7, 8, 9, 9, 11, 13, 17, 22, 25 };

    vector<pair<int,int>> jobs;
    for (int i = 0; i < jobCnt; ++i)
        jobs.push_back(make_pair(processTimes[i], dueTimes[i]));

    // step 1: sort by earliest deadline
    sort(jobs.begin(), jobs.end(), [](pair<int,int> p1, pair<int,int> p2)
         { return p1.second < p2.second; });

    // step 2: schedule the jobs back to back in that order
    vector<pair<int,int>> jobIntervals;
    int t = 0;
    for (int i = 0; i < jobCnt; ++i)
    {
        jobIntervals.push_back(make_pair(t, t + jobs[i].first));
        t += jobs[i].first;
    }

    // step 3: print the intervals and track the maximal lateness
    int lateness = 0;
    cout << "Intervals:" << endl;
    for (int i = 0; i < jobCnt; ++i)
    {
        auto pair = jobIntervals[i];
        lateness = max(lateness, pair.second - jobs[i].second);
        cout << "(" << pair.first << "," << pair.second << ") "
             << "Lateness: " << pair.second - jobs[i].second << std::endl;
    }
    cout << "maximal lateness is " << lateness << endl;
    return 0;
}
Intervals:
(0,2) Lateness:-2
(2,5) Lateness:-2
(5,8) Lateness: 0
(8,9) Lateness: 0
(9,12) Lateness: 3
(12,17) Lateness: 6
(17,21) Lateness: 8
(21,23) Lateness: 6
(23,25) Lateness: 3
(25,26) Lateness: 1
maximal lateness is 8
The runtime of the algorithm is obviously Θ(n log n), because sorting is the dominating operation
of this algorithm. Now we need to show that it is optimal. Clearly an optimal schedule has no idle
time; the earliest-deadline-first schedule also has no idle time.

Let's assume the jobs are numbered so that d1 <= d2 <= ... <= dn. We say an inversion of a
schedule is a pair of jobs i and j such that i < j but j is scheduled before i. By its definition,
the earliest-deadline-first schedule has no inversions. Of course, if a schedule has an inversion,
it has one with a pair of inverted jobs scheduled consecutively.
Proposition: Swapping two adjacent, inverted jobs reduces the number of inversions by one and
does not increase the maximal lateness.
Proof: Let L be the lateness before the swap and M the lateness afterwards. Because exchanging two
adjacent jobs does not move the other jobs from their positions, Lk = Mk for all k != i, j.

Clearly Mi <= Li, since job i got scheduled earlier. If job j is late, it follows from the
definition:

Mj = fi - dj    (definition)
   <= fi - di   (since i and j are exchanged)
   <= Li

That means the lateness after the swap is less than or equal to the lateness before. This
concludes the proof.
Proof (by contradiction):

Let's assume S* is an optimal schedule with the fewest possible number of inversions. We can
assume that S* has no idle time. If S* has no inversions, then S = S* and we are done. If S* has
an inversion, then it has an adjacent inversion. The proposition above states that we can swap an
adjacent inversion without increasing the lateness but with decreasing the number of inversions.
This contradicts the definition of S*.
The minimizing lateness problem and its closely related minimum makespan problem, which asks for a
schedule of minimal length, have lots of applications in the real world. But usually you don't
have only one machine but many, and they handle the same task at different rates. These problems
become NP-complete really fast.

Another interesting question arises if we don't look at the offline problem, where we have all
tasks and data at hand, but at the online variant, where tasks appear during execution.
Offline Caching
The caching problem arises from the limitation of finite space. Let's assume our cache C has k
pages. Now we want to process a sequence of m item requests, each of which must be placed in the
cache before it is processed. Of course if m <= k then we just put all elements in the cache and
it will work, but usually m >> k.

We say a request is a cache hit when the item is already in the cache; otherwise it's called a
cache miss. In that case we must bring the requested item into the cache and evict another,
assuming the cache is full. The goal is an eviction schedule that minimizes the number of
evictions.
There are numerous greedy strategies for this problem; let's look at some:

1. First in, first out (FIFO): the oldest page gets evicted
2. Last in, first out (LIFO): the newest page gets evicted
3. Least recently used (LRU): evict the page whose most recent access was earliest
4. Least frequently requested (LFU): evict the page that was least frequently requested
5. Longest forward distance (LFD): evict the page in the cache that is not requested until
farthest in the future

Attention: For the following examples we evict the page with the smallest index if more than one
page could be evicted.
Example (FIFO)
Let the cache size be k = 3, the initial cache (a,b,c), and the request sequence
a,a,d,e,b,b,a,c,f,d,e,a,f,b,e,c:
Request a a d e b b a c f d e a f b e c
cache 1 a a d d d d a a a d d d f f f c
cache 2 b b b e e e e c c c e e e b b b
cache 3 c c c c b b b b f f f a a a e e
cache miss x x x x x x x x x x x x x
Thirteen cache misses for sixteen requests does not sound very optimal; let's try the same example
with another strategy:
Example (LFD)
Let the cache size be k = 3, the initial cache (a,b,c), and the request sequence
a,a,d,e,b,b,a,c,f,d,e,a,f,b,e,c:
Request a a d e b b a c f d e a f b e c
cache 1 a a d e e e e e e e e e e e e c
cache 2 b b b b b b a a a a a a f f f f
cache 3 c c c c c c c c f d d d d b b b
cache miss x x x x x x x x
Self-test: Do the example for LIFO, LRU and LFU and see what happens.
The following example program (written in C++) consists of two parts:

The skeleton is an application which solves the problem depending on the chosen greedy strategy:
#include <algorithm>
#include <iostream>
#include <string>

const int cacheSize = 3;
const int requestLength = 16;
char request[] = {'a','a','d','e','b','b','a','c','f','d','e','a','f','b','e','c'};
char cache[] = {'a','b','c'};
// for reset
char originalCache[] = {'a','b','c'};

class Strategy {
public:
    Strategy(std::string name) : strategyName(name) {}
    virtual ~Strategy() = default;
    virtual int apply(int requestIndex) = 0;   // which slot to overwrite on a miss
    virtual void update(int cachePos, int requestIndex, bool cacheMiss) = 0;
    const std::string strategyName;
};

bool updateCache(int requestIndex, Strategy* strategy)
{
    // test for a cache hit
    bool isMiss = true;
    int cachePlace = -1;
    for (int i = 0; i < cacheSize; ++i)
        if (cache[i] == request[requestIndex]) { isMiss = false; cachePlace = i; }
    // on a miss the strategy picks the page to evict
    if (isMiss)
        cachePlace = strategy->apply(requestIndex);
    // write to cache
    cache[cachePlace] = request[requestIndex];
    strategy->update(cachePlace, requestIndex, isMiss);
    return isMiss;
}

int main()  // note: uses the strategy classes defined in the sections below
{
    Strategy* selectedStrategy[] = { new FIFO, new LIFO, new LRU, new LFU, new LFD };
    for (int strat = 0; strat < 5; ++strat)
    {
        std::copy(originalCache, originalCache + cacheSize, cache);  // reset
        std::cout << "\nStrategy: " << selectedStrategy[strat]->strategyName << std::endl;
        int cntMisses = 0;
        for (int i = 0; i < requestLength; ++i)
            if (updateCache(i, selectedStrategy[strat]))
                ++cntMisses;
        std::cout << "Misses: " << cntMisses << std::endl;
    }
    return 0;
}
The basic idea is simple: for every request I have two calls to my strategy:

1. apply: The strategy has to tell the caller which page to use
2. update: After the caller uses the place, it tells the strategy whether it was a miss or not.
Then the strategy may update its internal data. The strategy LFU for example has to update
the hit frequency for the cache pages, while the LFD strategy has to recalculate the
distances for the cache pages.
FIFO

class FIFO : public Strategy {
public:
    FIFO() : Strategy("FIFO")
    {
        for (int i = 0; i < cacheSize; ++i) age[i] = 0;
    }

    // evict the page that entered the cache first (largest age)
    int apply(int requestIndex) override
    {
        int oldest = 0;
        for (int i = 0; i < cacheSize; ++i)
            if (age[i] > age[oldest])
                oldest = i;
        return oldest;
    }

    // only a miss brings a new page in, so only then do the other pages age
    void update(int cachePos, int requestIndex, bool cacheMiss) override
    {
        if (!cacheMiss) return;
        for (int i = 0; i < cacheSize; ++i)
            if (i != cachePos)
                ++age[i];
            else
                age[i] = 0;
    }

private:
    int age[cacheSize];
};
FIFO just needs the information how long a page has been in the cache (and of course only relative
to the other pages). So the only thing to do is wait for a miss and then make the pages which were
not evicted older. For our example above the program solution is:
Strategy: FIFO
LIFO
class LIFO : public Strategy {
public:
    LIFO() : Strategy("LIFO")
    {
        for (int i = 0; i < cacheSize; ++i) age[i] = 0;
    }

    // evict the page that entered the cache last (smallest age)
    int apply(int requestIndex) override
    {
        int newest = 0;
        for (int i = 0; i < cacheSize; ++i)
            if (age[i] < age[newest])
                newest = i;
        return newest;
    }

    void update(int cachePos, int requestIndex, bool cacheMiss) override
    {
        if (!cacheMiss) return;
        for (int i = 0; i < cacheSize; ++i)
            if (i != cachePos)
                ++age[i];
            else
                age[i] = 0;
    }

private:
    int age[cacheSize];
};
The implementation of LIFO is more or less the same as for FIFO, but we evict the youngest, not
the oldest, page. The program results are:
Strategy: LIFO

Request cache 1 cache 2 cache 3 cache miss
a       a       b       c
a       a       b       c
d       d       b       c       x
e       e       b       c       x
b       e       b       c
b       e       b       c
a       a       b       c       x
c       a       b       c
f       f       b       c       x
d       d       b       c       x
e       e       b       c       x
a       a       b       c       x
f       f       b       c       x
b       f       b       c
e       e       b       c       x
c       e       b       c
LRU

class LRU : public Strategy {
public:
    LRU() : Strategy("LRU")
    {
        for (int i = 0; i < cacheSize; ++i) age[i] = 0;
    }

    // evict the page whose most recent access lies farthest in the past
    int apply(int requestIndex) override
    {
        int oldest = 0;
        for (int i = 0; i < cacheSize; ++i)
            if (age[i] > age[oldest])
                oldest = i;
        return oldest;
    }

    // unlike FIFO, every access (hit or miss) refreshes the accessed page
    void update(int cachePos, int requestIndex, bool cacheMiss) override
    {
        for (int i = 0; i < cacheSize; ++i)
            if (i != cachePos)
                ++age[i];
            else
                age[i] = 0;
    }

private:
    int age[cacheSize];
};
In the case of LRU the strategy is independent from what is at the cache page; its only interest
is the last usage. The program results are:
Cache initial: (a,b,c)
LFU

class LFU : public Strategy {
public:
    LFU() : Strategy("LFU")
    {
        for (int i = 0; i < cacheSize; ++i) requestFrequency[i] = 1;
    }

    // evict the page that was requested least often
    int apply(int requestIndex) override
    {
        int least = 0;
        for (int i = 0; i < cacheSize; ++i)
            if (requestFrequency[i] < requestFrequency[least])
                least = i;
        return least;
    }

    void update(int cachePos, int requestIndex, bool cacheMiss) override
    {
        if (cacheMiss)
            requestFrequency[cachePos] = 1;  // a freshly loaded page starts counting anew
        else
            ++requestFrequency[cachePos];
    }

private:
    // how frequently was the page used
    int requestFrequency[cacheSize];
};
LFU evicts the page used least often. So the update strategy is just to count every access. Of
course after a miss the count resets. The program results are:
Strategy: LFU
LFD

class LFD : public Strategy {
public:
    LFD() : Strategy("LFD")
    {
        // precalculate the first use of every page initially in the cache
        for (int i = 0; i < cacheSize; ++i)
            nextUse[i] = calcNextUse(-1, cache[i]);
    }

    // evict the page whose next request lies farthest in the future
    int apply(int requestIndex) override
    {
        int latest = 0;
        for (int i = 0; i < cacheSize; ++i)
            if (nextUse[i] > nextUse[latest])
                latest = i;
        return latest;
    }

    void update(int cachePos, int requestIndex, bool cacheMiss) override
    {
        nextUse[cachePos] = calcNextUse(requestIndex, cache[cachePos]);
    }

private:
    // index of the next request of page after requestIndex,
    // or requestLength + 1 if the page is never requested again
    int calcNextUse(int requestIndex, char page)
    {
        for (int i = requestIndex + 1; i < requestLength; ++i)
            if (request[i] == page)
                return i;
        return requestLength + 1;
    }

    int nextUse[cacheSize];
};
The LFD strategy is different from all the strategies before. It's the only strategy that uses the
future requests for its decision about which page to evict. The implementation uses the function
calcNextUse to get the page whose next use lies farthest away in the future. The program solution
is equal to the solution by hand from above:
Strategy: LFD
The greedy strategy LFD is indeed the only optimal strategy of the five presented. The proof is
rather long and can be found here or in the book by Jon Kleinberg and Eva Tardos (see sources in
remarks down below).
Algorithm vs Reality
The LFD strategy is optimal, but there is a big problem: it is an optimal offline solution. In
practice caching is usually an online problem, meaning the strategy is useless because we cannot
know the next time we need a particular item. The other four strategies are also online
strategies. For online problems we need a generally different approach.
Chapter 8: Bellman–Ford Algorithm
Remarks
Given a directed graph G, we often want to find the shortest distance from a given node A to the
rest of the nodes in the graph. Dijkstra's algorithm is the most famous algorithm for finding the
shortest path, but it works only if the edge weights of the given graph are non-negative.
Bellman-Ford, however, aims to find the shortest path from a given node (if one exists) even if
some of the weights are negative. Note that the shortest distance may not exist if a negative
cycle is present in the graph (in which case we can go around the cycle, resulting in an
arbitrarily small total distance). Bellman-Ford additionally allows us to determine the presence
of such a cycle.

Total complexity of the algorithm is O(V*E), where V is the number of vertices and E the number of
edges.
Examples
Single Source Shortest Path Algorithm (Given there is a negative cycle in a
graph)
Before reading this example, it is required to have a brief idea on edge-relaxation. You can learn it
from here
The Bellman-Ford algorithm computes the shortest paths from a single source vertex to all of the
other vertices in a weighted digraph. Even though it is slower than Dijkstra's algorithm, it works
in cases where the weight of an edge is negative, and it also finds a negative weight cycle in the
graph. The problem with Dijkstra's algorithm is that if there's a negative cycle, you keep going
through the cycle again and again and keep reducing the distance between two vertices.
The idea of this algorithm is to go through all the edges of this graph one-by-one in some random
order. It can be any random order. But you must ensure, if u-v (where u and v are two vertices in
a graph) is one of your orders, then there must be an edge from u to v. Usually it is taken directly
from the order of the input given. Again, any random order will work.
After selecting the order, we will relax the edges according to the relaxation formula. For a
given edge u-v going from u to v, the relaxation formula is:

if d[u] + cost[u][v] < d[v]
    d[v] := d[u] + cost[u][v]

That is, if the distance from source to any vertex u plus the weight of the edge u-v is less than
the distance from source to another vertex v, we update the distance from source to v. We need to
relax the edges at most (V-1) times, where V is the number of vertices in the graph. Why (V-1) you
ask? We'll explain it in another example. Also we are going to keep track of the parent vertex of
any vertex; that is, when we relax an edge, we will set:
parent[v] = u
It means we've found another shorter path to reach v via u. We will need this later to print the
shortest path from source to the destined vertex.
We have selected 1 as the source vertex. We want to find out the shortest path from the source
to all other vertices.
At first, d[1] = 0 because it is the source. And rest are infinity, because we don't know their
distance yet.
+--------+--------+--------+--------+--------+--------+--------+
| Serial | 1 | 2 | 3 | 4 | 5 | 6 |
+--------+--------+--------+--------+--------+--------+--------+
| Edge | 4->5 | 3->4 | 1->3 | 1->4 | 4->6 | 2->3 |
+--------+--------+--------+--------+--------+--------+--------+
You can take any sequence you want. If we relax the edges once, what do we get? We get the
distance from source to all other vertices of the path that uses at most 1 edge. Now let's relax the
edges and update the values of d[]. We get:
We couldn't update some vertices, because the d[u] + cost[u][v] < d[v] condition didn't match.
As we have said before, we found the paths from source to other nodes using maximum 1 edge.
Our second iteration will provide us with the path using 2 nodes. We get:
Our 3rd iteration will only update vertex 5, where d[5] will be 8. Our graph will look like:
After this, no matter how many iterations we do, we'll have the same distances. So we will keep a
flag that checks if any update takes place or not. If it doesn't, we'll simply break the loop. To
keep track of a negative cycle, we can modify our code using the procedure described here. Our
completed pseudo-code will be:
Procedure BellmanFord(Graph, source):
n := number of vertices in Graph
for i from 1 to n
    d[i] := infinity
end for
d[source] := 0
for i from 1 to n-1
    flag := false
for all edges from (u,v) in Graph
if d[u] + cost[u][v] < d[v]
d[v] := d[u] + cost[u][v]
parent[v] := u
flag := true
end if
end for
if flag == false
break
end for
for all edges from (u,v) in Graph
if d[u] + cost[u][v] < d[v]
Return "Negative Cycle Detected"
end if
end for
Return d
Printing Path:
To print the shortest path to a vertex, we'll iterate back to its parent until we find NULL and then
print the vertices. The pseudo-code will be:
Procedure PathPrinting(u)
v := parent[u]
if v == NULL
return
PathPrinting(v)
print -> u
Complexity:
Since we need to relax the edges maximum (V-1) times, the time complexity of this algorithm will
be equal to O(V * E) where E denotes the number of edges, if we use adjacency list to represent
the graph. However, if adjacency matrix is used to represent the graph, time complexity will be
O(V^3). Reason is we can iterate through all edges in O(E) time when adjacency list is used, but
it takes O(V^2) time when adjacency matrix is used.
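(For reference, here is a compact C++ sketch of the pseudocode above; it is not part of the
original text, and the edge-list representation plus the sample values are assumptions. The
d[e.u] != INF guard, which the pseudocode leaves implicit, prevents relaxing edges out of
unreached vertices.)

#include <iostream>
#include <vector>

struct Edge { int u, v, cost; };

const int INF = 1000000000;

// returns false if a negative cycle is reachable from the source
bool bellmanFord(int n, const std::vector<Edge>& edges, int source,
                 std::vector<int>& d, std::vector<int>& parent)
{
    d.assign(n + 1, INF);
    parent.assign(n + 1, -1);
    d[source] = 0;
    for (int i = 1; i <= n - 1; ++i) {          // relax all edges (V-1) times
        bool flag = false;
        for (const Edge& e : edges)
            if (d[e.u] != INF && d[e.u] + e.cost < d[e.v]) {
                d[e.v] = d[e.u] + e.cost;
                parent[e.v] = e.u;
                flag = true;
            }
        if (!flag) break;                       // no update in a full pass: done early
    }
    for (const Edge& e : edges)                 // one extra pass detects negative cycles
        if (d[e.u] != INF && d[e.u] + e.cost < d[e.v])
            return false;
    return true;
}

int main()
{
    // small assumed example: 4 vertices, edges given as (u, v, cost)
    std::vector<Edge> edges = { {1,2,2}, {2,3,3}, {3,4,2}, {1,4,10} };
    std::vector<int> d, parent;
    if (bellmanFord(4, edges, 1, d, parent))
        for (int v = 1; v <= 4; ++v)
            std::cout << "d[" << v << "]=" << d[v] << std::endl;
    else
        std::cout << "Negative Cycle Detected" << std::endl;
    return 0;
}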
In Bellman-Ford algorithm, to find out the shortest path, we need to relax all the edges of the
graph. This process is repeated at most (V-1) times, where V is the number of vertices in the
graph.
The number of iterations needed to find out the shortest path from source to all other vertices
depends on the order that we select to relax the edges.
Here, the source vertex is 1. We will find out the shortest distance between the source and all the
other vertices. We can clearly see that, to reach vertex 4, in the worst case, it'll take (V-1) edges.
Now depending on the order in which the edges are discovered, it might take (V-1) times to
discover vertex 4. Didn't get it? Let's use Bellman-Ford algorithm to find out the shortest path
here:
+--------+--------+--------+--------+
| Serial | 1 | 2 | 3 |
+--------+--------+--------+--------+
| Edge | 3->4 | 2->3 | 1->2 |
+--------+--------+--------+--------+
First iteration:

We can see that our relaxation process only changed d[2]. Our graph will look like:
Second iteration:
This time the relaxation process changed d[3]. Our graph will look like:
Third iteration:
1. d[3] + cost[3][4] = 7 < d[4]. d[4] = 7. parent[4] = 3.
2. It won't be changed.
3. It won't be changed.
Our third iteration finally found out the shortest path to 4 from 1. Our graph will look like:
So, it took 3 iterations to find out the shortest path. After this one, no matter how many times we
relax the edges, the values in d[] will remain the same. Now, if we considered another sequence:
+--------+--------+--------+--------+
| Serial | 1 | 2 | 3 |
+--------+--------+--------+--------+
| Edge | 1->2 | 2->3 | 3->4 |
+--------+--------+--------+--------+
We'd get the answer immediately: our very first iteration finds the shortest path from the source
to all the other nodes. Another sequence, 1->2, 3->4, 2->3, is possible, which would give us the
shortest paths after 2 iterations. We can come to the conclusion that, no matter how we arrange
the sequence, it won't take more than 3 iterations to find the shortest paths from the source in
this example.
We can conclude that, for the best case, it'll take 1 iteration to find out the shortest path from
source. For the worst case, it'll take (V-1) iterations, which is why we repeat the process of
relaxation (V-1) times.
To understand this example, it is recommended to have a brief idea about Bellman-Ford algorithm
which can be found here
Using Bellman-Ford algorithm, we can detect if there is a negative cycle in our graph. We know
that, to find out the shortest path, we need to relax all the edges of the graph (V-1) times, where V
is the number of vertices in a graph. We have already seen that in this example, after (V-1)
iterations, we can't update d[], no matter how many iterations we do. Or can we?
If there is a negative cycle in a graph, even after (V-1) iterations, we can update d[]. This happens
because for every iteration, traversing through the negative cycle always decreases the cost of the
shortest path. This is why Bellman-Ford algorithm limits the number of iterations to (V-1). If we
used Dijkstra's Algorithm here, we'd be stuck in an endless loop. However, let's concentrate on
finding negative cycle.
Let's pick vertex 1 as the source. After applying Bellman-Ford's single source shortest path
algorithm to the graph, we'll find out the distances from the source to all the other vertices.
This is how the graph looks like after (V-1) = 3 iterations. It should be the result since there are 4
edges, we need at most 3 iterations to find out the shortest path. So either this is the answer, or
there is a negative weight cycle in the graph. To find that, after (V-1) iterations, we do one more
final iteration and if the distance continues to decrease, it means that there is definitely a negative
weight cycle in the graph.
For this example: if we check 2-3, d[2] + cost[2][3] will give us 1 which is less than d[3]. So we
can conclude that there is a negative cycle in our graph.
So how do we find out the negative cycle? We do a bit modification to Bellman-Ford procedure:
Procedure NegativeCycleDetector(Graph, source):
n := number of vertices in Graph
for i from 1 to n
d[i] := infinity
end for
d[source] := 0
for i from 1 to n-1
flag := false
for all edges from (u,v) in Graph
if d[u] + cost[u][v] < d[v]
d[v] := d[u] + cost[u][v]
flag := true
end if
end for
if flag == false
break
end for
for all edges from (u,v) in Graph
if d[u] + cost[u][v] < d[v]
Return "Negative Cycle Detected"
end if
end for
Return "No Negative Cycle"
This is how we find out if there is a negative cycle in a graph. We can also modify Bellman-Ford
Algorithm to keep track of negative cycles.
Chapter 9: Big-O Notation
Remarks
Definition
The Big-O notation is at its heart a mathematical notation, used to compare the rate of
convergence of functions. Let n -> f(n) and n -> g(n) be functions defined over the natural
numbers. Then we say that f = O(g) if and only if f(n)/g(n) is bounded when n approaches infinity.
In other words, f = O(g) if and only if there exists a constant A, such that for all n, f(n)/g(n) <= A.
Actually the scope of the Big-O notation is a bit wider in mathematics but for simplicity I have
narrowed it to what is used in algorithm complexity analysis : functions defined on the naturals,
that have non-zero values, and the case of n growing to infinity.
Let's take the case of f(n) = 100n^2 + 10n + 1 and g(n) = n^2. It is quite clear that both of these
functions tend to infinity as n tends to infinity. But sometimes knowing the limit is not enough, and
we also want to know the speed at which the functions approach their limit. Notions like Big-O help
compare and classify functions by their speed of convergence.
Let's find out if f = O(g) by applying the definition. We have f(n)/g(n) = 100 + 10/n + 1/n^2. Since
10/n is 10 when n is 1 and is decreasing, and since 1/n^2 is 1 when n is 1 and is also decreasing,
we have f(n)/g(n) <= 100 + 10 + 1 = 111. The definition is satisfied because we have found a
bound of f(n)/g(n) (111) and so f = O(g) (we say that f is a Big-O of n^2).
This means that f tends to infinity at approximately the same speed as g. Now this may seem like
a strange thing to say, because what we have found is that f is at most 111 times bigger than g, or
in other words when g grows by 1, f grows by at most 111. It may seem that growing 111 times
faster is not "approximately the same speed". And indeed the Big-O notation is not a very precise
way to classify function convergence speed, which is why in mathematics we use the equivalence
relationship when we want a precise estimation of speed. But for the purposes of separating
algorithms in large speed classes, Big-O is enough. We don't need to separate functions that grow
a fixed number of times faster than each other, but only functions that grow infinitely faster than
each other. For instance if we take h(n) = n^2*log(n), we see that h(n)/g(n) = log(n) which tends
to infinity with n so h is not O(n^2), because h grows infinitely faster than n^2.
Now I need to make a side note : you might have noticed that if f = O(g) and g = O(h), then f =
O(h). For instance in our case, we have f = O(n^3), and f = O(n^4)... In algorithm complexity
analysis, we frequently say f = O(g) to mean that f = O(g) and g = O(f), which can be understood
as "g is the smallest Big-O for f". In mathematics we say that such functions are Big-Thetas of
each other.
How is it used ?
When comparing algorithm performance, we are interested in the number of operations that an
algorithm performs. This is called time complexity. In this model, we consider that each basic
operation (addition, multiplication, comparison, assignment, etc.) takes a fixed amount of time, and
we count the number of such operations. We can usually express this number as a function of the
size of the input, which we call n. And sadly, this number usually grows to infinity with n (if it
doesn't, we say that the algorithm is O(1)). We separate our algorithms in big speed classes
defined by Big-O : when we speak about a "O(n^2) algorithm", we mean that the number of
operations it performs, expressed as a function of n, is a O(n^2). This says that our algorithm is
approximately as fast as an algorithm that would do a number of operations equal to the square of
the size of its input, or faster. The "or faster" part is there because I used Big-O instead of Big-
Theta, but usually people will say Big-O to mean Big-Theta.
When counting operations, we usually consider the worst case: for instance if we have a loop that
can run at most n times and that contains 5 operations, the number of operations we count is 5n. It
is also possible to consider the average case complexity.
Quick note: a fast algorithm is one that performs few operations, so if the number of operations
grows to infinity faster, then the algorithm is slower: O(n) is better than O(n^2).
We are also sometimes interested in the space complexity of our algorithm. For this we consider
the number of bytes in memory occupied by the algorithm as a function of the size of the input,
and use Big-O the same way.
Examples
A Simple Loop
The input size is the size of the array, which I called len in the code.
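One possible version of that function (a sketch; the names array, len and max follow the discussion below):

int find_max(const int *array, int len) {
    int max = array[0];              /* first assignment */
    for (int i = 0; i < len; i++) {  /* i = 0 is the second assignment */
        if (max < array[i]) {
            max = array[i];
        }
    }
    return max;
}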
These two assignments are done only once, so that's 2 operations. The operations that are looped
are:
i++;
if (max < array[i])
max = array[i]
Since there are 3 operations in the loop, and the loop is done n times, we add 3n to our already
existing 2 operations to get 3n + 2. So our function takes 3n + 2 operations to find the max (its
complexity is 3n + 2). This is a polynomial where the fastest growing term is a factor of n, so it is
O(n).
You probably have noticed that "operation" is not very well defined. For instance I said that if (max
< array[i]) was one operation, but depending on the architecture this statement can compile to,
for instance, three instructions: one memory read, one comparison and one branch. I have also
considered all operations as the same, even though, for instance, the memory operations will be
slower than the others, and their performance will vary wildly due, for instance, to cache effects. I
also have completely ignored the return statement, the fact that a frame will be created for the
function, etc. In the end it doesn't matter to complexity analysis, because whatever way I choose
to count operations, it will only change the coefficient of the n factor and the constant, so the result
will still be O(n). Complexity shows how the algorithm scales with the size of the input, but it isn't
the only aspect of performance!
A Nested Loop
The following function checks if an array has any duplicates by taking each element, then iterating
over the whole array to see if the element appears at another index:
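A sketch of that naive version (note the i == j guard, which the improved version further down drops):

int has_duplicates(const int *array, int len) {
    for (int i = 0; i < len; i++) {
        for (int j = 0; j < len; j++) {
            if (i != j && array[i] == array[j]) {
                return 1; /* the same value occurs at two different indexes */
            }
        }
    }
    return 0;
}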
At each iteration, the inner loop performs a number of operations that is constant in n. The outer
loop also does a few constant operations and runs the inner loop n times; the outer loop itself is
run n times. So the operations inside the inner loop are run n^2 times, the operations in the outer
loop are run n times, and the assignment to i is done one time. Thus, the complexity will be
something like an^2 + bn + c, and since the highest term is n^2, the O notation is O(n^2).
As you may have noticed, we can improve the algorithm by avoiding doing the same comparisons
multiple times. We can start from i + 1 in the inner loop, because all elements before it will already
have been checked against all array elements, including the one at index i + 1. This allows us to
drop the i == j check.
int has_duplicates(const int *array, int len) {
    for (int i = 0; i < len - 1; i++) {
        for (int j = i + 1; j < len; j++) {
            if (array[i] == array[j]) {
                return 1;
            }
        }
    }
    return 0;
}
Obviously, this second version performs fewer operations and so is more efficient. How does that
translate to Big-O notation? Well, now the inner loop body is run 1 + 2 + ... + n - 1 = n(n-1)/2
times. This is still a polynomial of the second degree, and so is still only O(n^2). We have clearly
lowered the complexity, since we roughly divided by 2 the number of operations that we are doing,
but we are still in the same complexity class as defined by Big-O. In order to lower the complexity
to a lower class we would need to divide the number of operations by something that tends to
infinity with n.
An O(log n) example
Introduction
Consider the following problem:
L is a sorted list containing n signed integers (n being big enough), for example [-5, -2, -1, 0, 1,
2, 4] (here, n has a value of 7). If L is known to contain the integer 0, how can you find the index of
0?
Naïve approach
The first thing that comes to mind is to just read every index until 0 is found. In the worst case, the
number of operations is n, so the complexity is O(n).
This works fine for small values of n, but is there a more efficient way?
Dichotomy
Consider the following algorithm (Python3):
def find_zero(L):
    a = 0
    b = len(L) - 1
    while True:
        h = (a+b)//2  ## // is the integer division, so h is an integer
        if L[h] == 0:
            return h
        elif L[h] > 0:
            b = h - 1
        elif L[h] < 0:
            a = h + 1
a and b are the indexes between which 0 is to be found. Each time we enter the loop, we use an
index between a and b and use it to narrow the area to be searched.
In the worst case, we have to wait until a and b are equal. But how many operations does that
take? Not n, because each time we enter the loop, we divide the distance between a and b by
about two. Rather, the complexity is O(log n).
Explanation
Note: When we write "log", we mean the binary logarithm, or log base 2 (which we will write
"log_2"). As O(log_2 n) = O(log n) (you can do the math) we will use "log" instead of "log_2".
Conclusion
When faced with successive divisions (be it by two or by any number), remember that the
complexity is logarithmic.
Let's say we have a problem of size n. Now, for each step of our algorithm (which we have to
write), our original problem becomes half of its previous size (n/2).
Step    Problem size
1       n/2
2       n/4
3       n/8
4       n/16
When the problem is solved completely, it cannot be reduced any further: n becomes equal to 1
and we exit the check condition.
1. problem-size = 1 (when the problem is solved completely)
2. problem-size = n/2^k (after k halving steps)
3. From 1 and 2:

n/2^k = 1
or
n = 2^k

Taking the logarithm of both sides:

log_e(n) = k * log_e(2)
or
k = log_e(n) / log_e(2)
k = log_2(n)

or simply k = log n
Now we know that our algorithm can run for at most log n steps, hence the time complexity comes
out as O(log n).
So now, if someone asks you: if n is 256, how many steps will that loop (or any other algorithm
that cuts its problem size in half) run? You can very easily calculate:

k = log_2(256)
k = 8
Another very good example of a similar case is the Binary Search algorithm:
int binarySearch(int arr[], int low, int high, int item)
{
    while(low <= high){
        int mid = low + (high - low) / 2;
        if(arr[mid] == item)
            return mid;
        else if(arr[mid] < item)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return -1; // Unsuccessful result
}
Chapter 10: Binary Search Trees
Introduction
A binary tree is a tree in which each node has at most two children. A binary search tree (BST) is
a binary tree whose elements are positioned in a special order: for each node, all values (i.e.
keys) in its left sub tree are less than its own key, and all values in its right sub tree are greater.
Examples
Binary Search Tree - Insertion (Python)
Following the code snippet, each image shows the execution visualization, which makes it easier
to see how this code works.
class Node:
def __init__(self, val):
self.l_child = None
self.r_child = None
self.data = val
def insert(root, node):
if root is None:
root = node
else:
if root.data > node.data:
if root.l_child is None:
root.l_child = node
else:
insert(root.l_child, node)
else:
if root.r_child is None:
root.r_child = node
else:
insert(root.r_child, node)
def in_order_print(root):
if not root:
return
in_order_print(root.l_child)
    print(root.data)
in_order_print(root.r_child)
def pre_order_print(root):
if not root:
return
    print(root.data)
pre_order_print(root.l_child)
pre_order_print(root.r_child)
Binary Search Tree - Deletion(C++)
Before starting with deletion, let's first shed some light on what a binary search tree (BST) is.
Each node in a BST can have at most two children (a left and a right child). The left sub-tree of a
node has keys less than or equal to its parent node's key; the right sub-tree of a node has keys
greater than its parent node's key.
Deleting a node in a tree while maintaining its Binary search tree property.
Explanation of cases:
1. When the node to be deleted is a leaf node, simply delete the node and pass nullptr to its
parent node.
2. When the node to be deleted has only one child, copy the child's value into the node and
delete the child (converted to case 1).
3. When the node to be deleted has two children, the minimum of its right sub-tree can be copied
into the node, and then that minimum value can be deleted from the node's right sub-tree
(converted to case 2).

Note: the minimum of the right sub-tree can have at most one child, and that child must be a right
child: if it had a left child, it would not be the minimum value (or the tree would not satisfy the BST
property).
struct node
{
int data;
node *left, *right;
};
node* delete_node(node* root, int key)
{
    if(root == nullptr)
        return root;
    if(key < root->data)          // first locate the node carrying the key
        root->left = delete_node(root->left, key);
    else if(key > root->data)
        root->right = delete_node(root->right, key);
    else
    {
if(root->left == nullptr && root->right == nullptr) // Case 1
{
free(root);
root = nullptr;
}
else if(root->left == nullptr) // Case 2
{
node* temp = root;
root= root->right;
free(temp);
}
else if(root->right == nullptr) // Case 2
{
node* temp = root;
root = root->left;
free(temp);
}
else // Case 3
{
node* temp = root->right;
root->data = temp->data;
root->right = delete_node(root->right, temp->data);
}
}
return root;
}
Time complexity of above code is O(h), where h is the height of the tree.
Consider the BST:

The binary search tree property can be used for finding the lowest common ancestor of two given
nodes.

Pseudo code:
lowestCommonAncestor(root, node1, node2){
    if(root == NULL)
        return NULL;
    // if both nodes are smaller than root, the LCA lies in the left subtree
    else if(node1->data < root->data && node2->data < root->data) {
        return lowestCommonAncestor(root->left, node1, node2);
    }
    // if both nodes are greater than root, the LCA lies in the right subtree
    else if(node1->data > root->data && node2->data > root->data) {
        return lowestCommonAncestor(root->right, node1, node2);
    }
    // otherwise the nodes lie on different sides of root, so root is the LCA
    else {
        return root;
    }
}
class Node(object):
def __init__(self, val):
self.l_child = None
self.r_child = None
self.val = val
class BinarySearchTree(object):
    def insert(self, root, node):
        if root is None:
            return node
        if node.val < root.val:
            root.l_child = self.insert(root.l_child, node)
        else:
            root.r_child = self.insert(root.r_child, node)
        return root
r = Node(3)
node = BinarySearchTree()
nodeList = [1, 8, 5, 12, 14, 6, 15, 7, 16, 8]
for nd in nodeList:
node.insert(r, Node(nd))
Chapter 11: Binary Tree traversals
Introduction
Visiting the nodes of a binary tree in some particular order is called a traversal.
Examples
Pre-order, Inorder and Post Order traversal of a Binary Tree
The outputs below are for the complete binary tree whose root is 1, where node 2 (with children 4
and 5) is the left child of the root and node 3 (with children 6 and 7) is the right child.

Pre-order traversal (root) is traversing the node, then the left sub-tree of the node, and then the
right sub-tree of the node:

1 2 4 5 3 6 7
In-order traversal (root) is traversing the left sub-tree of the node, then the node, and then the
right sub-tree of the node:

4 2 5 1 6 3 7
Post-order traversal (root) is traversing the left sub-tree of the node, then the right sub-tree, and
then the node:

4 5 2 6 7 3 1
Level order traversal will be:

1 2 3 4 5 6 7
Code:
#include<iostream>
#include<queue>
#include<malloc.h>
using namespace std;
struct node{
int data;
node *left;
node *right;
};
void levelOrder(struct node* root){
    if(root == NULL) return;
    queue<node *> Q;
    Q.push(root);
    while(!Q.empty()){
        struct node* curr = Q.front();
        cout << curr->data << " ";
        if(curr->left != NULL) Q.push(curr->left);
        if(curr->right != NULL) Q.push(curr->right);
        Q.pop();
    }
}
struct node* newNode(int data)
{
struct node* node = (struct node*)
malloc(sizeof(struct node));
node->data = data;
node->left = NULL;
node->right = NULL;
return(node);
}
int main(){
    // build the example tree: 1 is the root, 2 and 3 its children, 4..7 the leaves
    struct node* root = newNode(1);
    root->left = newNode(2);
    root->right = newNode(3);
    root->left->left = newNode(4);
    root->left->right = newNode(5);
    root->right->left = newNode(6);
    root->right->right = newNode(7);
    levelOrder(root); // prints: 1 2 3 4 5 6 7
    return 0;
}
Chapter 12: Breadth-First Search
Examples
Finding the Shortest Path from Source to other Nodes
Breadth-first search (BFS) is an algorithm for traversing or searching tree or graph data structures.
It starts at the tree root (or some arbitrary node of a graph, sometimes referred to as a 'search
key') and explores the neighbor nodes first, before moving to the next-level neighbors. BFS was
invented in the late 1950s by Edward Forrest Moore, who used it to find the shortest path out of a
maze, and was discovered independently by C. Y. Lee as a wire routing algorithm in 1961.
Let's assume this graph represents connections between multiple cities, where each node denotes
a city and an edge between two nodes denotes a road linking them. We want to go from
node 1 to node 10. So node 1 is our source, which is level 0. We mark node 1 as visited. We
can go to node 2, node 3 and node 4 from here. So they'll be level (0+1) = level 1 nodes. Now
we'll mark them as visited and work with them.
The colored nodes are visited. The nodes that we're currently working with will be marked with
pink. We won't visit the same node twice. From node 2, node 3 and node 4, we can go to node 6,
node 7 and node 8. Let's mark them as visited. The level of these nodes will be level (1+1) =
level 2.
If you haven't noticed, the level of a node simply denotes its shortest-path distance from the
source. For example: we've found node 8 on level 2, so the distance from the source to node 8 is 2.

We haven't yet reached our target node, node 10. So let's visit the nodes we can directly reach
from node 6, node 7 and node 8.
We can see that, we found node 10 at level 3. So the shortest path from source to node 10 is 3.
We searched the graph level by level and found the shortest path. Now let's erase the edges that
we didn't use:
After removing the edges that we didn't use, we get a tree called BFS tree. This tree shows the
shortest path from source to all other nodes.
So our task will be to go from the source to the level 1 nodes, then from level 1 to level 2 nodes,
and so on, until we reach our destination. We can use a queue to store the nodes that we are
going to process: for each node we work with, we'll push into the queue all the nodes that can be
directly reached and are not yet visited.
First we push the source in the queue. Our queue will look like:
front
+-----+
| 1 |
+-----+
The level of node 1 will be 0. level[1] = 0. Now we start our BFS. At first, we pop a node from our
queue. We get node 1. We can go to node 4, node 3 and node 2 from this one. We've reached
these nodes from node 1. So level[4] = level[3] = level[2] = level[1] + 1 = 1. Now we mark them
as visited and push them in the queue.
front
+-----+ +-----+ +-----+
| 2 | | 3 | | 4 |
+-----+ +-----+ +-----+
Now we pop node 4 and work with it. We can go to node 7 from node 4. level[7] = level[4] + 1 =
2. We mark node 7 as visited and push it in the queue.
front
+-----+ +-----+ +-----+
| 7 | | 2 | | 3 |
+-----+ +-----+ +-----+
From node 3, we can go to node 7 and node 8. Since we've already marked node 7 as visited,
we only mark node 8 as visited and set level[8] = level[3] + 1 = 2. We push node 8 in the queue.
front
+-----+ +-----+ +-----+
| 8 | | 7 | | 2 |
+-----+ +-----+ +-----+
This process will continue till we reach our destination or the queue becomes empty. The level
array will provide us with the distance of the shortest path from the source. We can initialize the
level array with an infinity value, which will mark the nodes as not yet visited. Our pseudo-code
will be:
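A sketch, in the same style as the other procedures in this document (for the path-printing variant
described next, the line parent[v] := u would be added inside the for loop):

Procedure BFS(G, source):
Q = queue()
for each node u in V[G]
    level[u] := infinity
end for
level[source] := 0
Q.push(source)
while Q is not empty
    u := Q.pop()
    for all edges from u to v in G.adjacentEdges(u) do
        if level[v] is infinity          // v is not yet visited
            level[v] := level[u] + 1
            Q.push(v)
        end if
    end for
end while
Return level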
By iterating through the level array, we can find out the distance of each node from source. For
example: the distance of node 10 from source will be stored in level[10].
Sometimes we might need to print not only the shortest distance, but also the path via which we
can go to our destination node from the source. For this we need to keep a parent array.
parent[source] will be NULL. For each update in the level array, we'll simply add parent[v] := u in
our pseudo-code inside the for loop. After finishing BFS, to find the path, we'll traverse back
through the parent array until we reach the source, which will be denoted by a NULL value. The
pseudo-code will be:
Recursive version:

if parent[u] is not equal to null
    PrintPath(parent[u])
end if
print -> u

Iterative version:

S = Stack()
while parent[u] is not equal to null
    S.push(u)
    u := parent[u]
end while
while S is not empty
    print -> S.pop
end while
Complexity:
We've visited every node once and every edge once. So the complexity will be O(V + E), where V
is the number of nodes and E is the number of edges.
Most of the time, we'll need to find the shortest path from a single source to all other nodes, or to
a specific node, in a 2D grid. Say for example: we want to find out how many moves are required
for a knight to reach a certain square on a chessboard, or we have a grid where some cells are
blocked and we have to find the shortest path from one cell to another. Typically we can move
only horizontally and vertically; in some variants even diagonal moves are possible. For these
cases, we can convert the squares or cells into nodes and solve these problems easily using
BFS. Now our visited, parent and level structures will be 2D arrays. For each node, we'll consider
all possible moves. To find the distance to a specific node, we'll also check whether we have
reached our destination.
There will be one additional thing called direction array. This will simply store the all possible
combinations of directions we can go to. Let's say, for horizontal and vertical moves, our direction
arrays will be:
+----+-----+-----+-----+-----+
| dx | 1 | -1 | 0 | 0 |
+----+-----+-----+-----+-----+
| dy | 0 | 0 | 1 | -1 |
+----+-----+-----+-----+-----+
Here dx represents the move along the x-axis and dy the move along the y-axis. Again, this part
is optional; you can also write out all the possible combinations separately, but it's easier to
handle them using a direction array. There can be more, and even different, combinations for
diagonal moves or knight moves.
• If any of the cells is blocked, then for every possible move we'll check whether the cell is blocked or not.
• We'll also check if we have gone out of bounds, that is we've crossed the array boundaries.
• The number of rows and columns will be given.
Procedure BFS2D(Graph, source, row, column):
for i from 1 to row
    for j from 1 to column
        visited[i][j] := false
    end for
end for
visited[source.x][source.y] := true
level[source.x][source.y] := 0
Q = queue()
Q.push(source)
m := dx.size
while Q is not empty
top := Q.pop
for i from 1 to m
temp.x := top.x + dx[i]
temp.y := top.y + dy[i]
if temp is inside the row and column and Graph[temp.x][temp.y] doesn't equal blocksign and visited[temp.x][temp.y] is false
visited[temp.x][temp.y] := true
level[temp.x][temp.y] := level[top.x][top.y] + 1
Q.push(temp)
end if
end for
end while
Return level
As we have discussed earlier, BFS only works for unweighted graphs. For weighted graphs we
need Dijkstra's algorithm, and for negative edge cycles we need the Bellman-Ford algorithm;
again, these are single-source shortest path algorithms. If we need to find the distance from every
node to all other nodes, we'll need the Floyd-Warshall algorithm.

BFS can be used to find the connected components of an undirected graph. We can also find out
whether the given graph is connected or not. Our subsequent discussion assumes we are dealing
with undirected graphs. The definition of a connected graph is: a graph is connected if there is a
path between every pair of vertices.
Following graph is not connected and has 2 connected components:
BFS is a graph traversal algorithm. So, starting from a random source node, if on termination of
the algorithm all nodes are visited, then the graph is connected; otherwise it is not connected.
boolean isConnected(Graph g)
{
BFS(v)//v is a random source node.
if(allVisited(g))
{
return true;
}
else return false;
}
#include<stdio.h>
#include<stdlib.h>
#define MAXVERTICES 100
void enqueue(int);
int deque();
int isConnected(char **graph,int noOfVertices);
void BFS(char **graph,int vertex,int noOfVertices);
int count = 0;
//Queue node depicts a single Queue element
//It is NOT a graph node.
struct node
{
int v;
struct node *next;
};
typedef struct node Node;
typedef struct node *Nodeptr;

Nodeptr Qfront = NULL; // front of the BFS queue
Nodeptr Qrear = NULL;  // rear of the BFS queue
int main()
{
    int n,e;//n is number of vertices, e is number of edges.
    int i,j;
    char **graph;//adjacency matrix

    // read n and e, allocate the n x n matrix and fill in the e edges here

    if(isConnected(graph,n))
        printf("The graph is connected");
    else printf("The graph is NOT connected\n");
}
void enqueue(int vertex)
{
    Nodeptr newNode = malloc(sizeof(Node));
    newNode->v = vertex;
    newNode->next = NULL;
    if(Qfront == NULL)
        Qfront = Qrear = newNode;
    else
    {
        Qrear->next = newNode;
        Qrear = newNode;
    }
}
int deque()
{
if(Qfront == NULL)
{
printf("Q is empty , returning -1\n");
return -1;
}
else
{
int v = Qfront->v;
Nodeptr temp= Qfront;
if(Qfront == Qrear)
{
Qfront = Qfront->next;
Qrear = NULL;
}
else
Qfront = Qfront->next;
free(temp);
return v;
}
}
For finding all the connected components of an undirected graph, we only need to add 2 lines of
code to the BFS function. The idea is to call the BFS function until all vertices are visited, and to
add

printf("%d ",vertex+1);

as the first line of the while loop in BFS.
Chapter 13: Bubble Sort
Parameters
Parameter Description
Stable Yes
In place Yes
Examples
Bubble Sort
The BubbleSort compares each successive pair of elements in an unordered list and inverts the
elements if they are not in order.
The following example illustrates the bubble sort on the list {6,5,3,1,8,7,2,4} (pairs that were
compared in each step are encapsulated in '**'):
{6,5,3,1,8,7,2,4}
{**5,6**,3,1,8,7,2,4} -- 5 < 6 -> swap
{5,**3,6**,1,8,7,2,4} -- 3 < 6 -> swap
{5,3,**1,6**,8,7,2,4} -- 1 < 6 -> swap
{5,3,1,**6,8**,7,2,4} -- 8 > 6 -> no swap
{5,3,1,6,**7,8**,2,4} -- 7 < 8 -> swap
{5,3,1,6,7,**2,8**,4} -- 2 < 8 -> swap
{5,3,1,6,7,2,**4,8**} -- 4 < 8 -> swap
After one iteration through the list, we have {5,3,1,6,7,2,4,8}. Note that the greatest unsorted
value in the array (8 in this case) will always reach its final position. Thus, to be sure the list is
sorted we must iterate n-1 times for lists of length n.
Graphic:
Implementation in Javascript
function bubbleSort(a)
{
var swapped;
do {
swapped = false;
for (var i=0; i < a.length-1; i++) {
if (a[i] > a[i+1]) {
var temp = a[i];
a[i] = a[i+1];
a[i+1] = temp;
swapped = true;
}
}
} while (swapped);
}
var a = [3, 203, 34, 746, 200, 984, 198, 764, 9];
bubbleSort(a);
console.log(a); //logs [ 3, 9, 34, 198, 200, 203, 746, 764, 984 ]
Implementation in C#
Bubble sort is also known as Sinking Sort. It is a simple sorting algorithm that repeatedly steps
through the list to be sorted, compares each pair of adjacent items and swaps them if they are in
the wrong order.
Implementation of Bubble Sort
The implementation below uses C++-style syntax to show the idea (note that the vector is taken
by reference so the caller's list actually gets sorted); the algorithm is identical in C#:

void bubbleSort(vector<int>& numbers)
{
    for(int i = numbers.size() - 1; i >= 0; i--) {
        for(int j = 1; j <= i; j++) {
            if(numbers[j-1] > numbers[j]) {
                swap(numbers[j-1], numbers[j]);
            }
        }
    }
}
C Implementation
void bubbleSort(int list[], int n)
{
    int c, d, t;
    for (c = 0; c < n - 1; c++)
        for (d = 0; d < n - c - 1; d++)
            if (list[d] > list[d+1]) {
                t = list[d];
                list[d] = list[d+1];
                list[d+1] = t;
            }
}

/* the same algorithm written with pointer arithmetic */
void bubbleSortPointer(int *list, int n)
{
    int c, d, t;
    for (c = 0; c < n - 1; c++)
        for (d = 0; d < n - c - 1; d++)
            if (*(list + d) > *(list + d + 1)) {
                t = *(list + d);
                *(list + d) = *(list + d + 1);
                *(list + d + 1) = t;
            }
}
Implementation in Java
public class BubbleSort {

    static void bubbleSort(int[] array) {
        for (int i = 0; i < array.length - 1; i++) {
            for (int j = i + 1; j < array.length; j++) {
                if (array[i] > array[j]) {
                    int temp;
                    temp = array[i];
                    array[i] = array[j];
                    array[j] = temp;
                }
            }
        }
        printNumbers(array);
    }

    static void printNumbers(int[] array) {
        System.out.println(java.util.Arrays.toString(array));
    }
}
Python Implementation
#!/usr/bin/python
input_list = [10, 1, 2, 11]
n = len(input_list)
for i in range(n):
    for j in range(n - i - 1):
        if input_list[j] > input_list[j+1]:
            input_list[j], input_list[j+1] = input_list[j+1], input_list[j]
print(input_list)
Chapter 14: Bucket Sort
Examples
Bucket Sort Basic Information
Bucket Sort is a sorting algorithm in which the elements of the input array are distributed into
buckets. After distributing all the elements, the buckets are sorted individually by another sorting
algorithm; sometimes bucket sort is also applied recursively.
C# Implementation
using System.Collections.Generic;

public class BucketSort
{
    public static void SortBucket(ref int[] input)
    {
        int minValue = input[0];
        int maxValue = input[0];
        int k = 0;
        foreach (int v in input) { if (v > maxValue) maxValue = v; if (v < minValue) minValue = v; }
        var bucket = new List<int>[maxValue - minValue + 1]; // one bucket per possible value
        for (int i = 0; i < bucket.Length; i++) bucket[i] = new List<int>();
        foreach (int v in input) bucket[v - minValue].Add(v); // distribute
        foreach (var b in bucket) foreach (int v in b) input[k++] = v; // gather in order
    }
}
Chapter 15: Catalan Number Algorithm
Examples
Catalan Number Algorithm Basic Information
In combinatorial mathematics, the Catalan numbers form a sequence of natural numbers that
occur in various counting problems, often involving recursively-defined objects. The Catalan
numbers on nonnegative integers n are a set of numbers that arise in tree enumeration problems
of the type, 'In how many ways can a regular n-gon be divided into n-2 triangles if different
orientations are counted separately?'
1. The number of ways to stack coins on a bottom row that consists of n consecutive coins in a
plane, such that no coins are allowed to be put on the two sides of the bottom coins and
every additional coin must be above two other coins, is the nth Catalan number.
2. The number of ways to group a string of n pairs of parentheses, such that each open
parenthesis has a matching closed parenthesis, is the nth Catalan number.
3. The number of ways to cut an n+2-sided convex polygon in a plane into triangles by
connecting vertices with straight, non-intersecting lines is the nth Catalan number. This is the
application in which Euler was interested.
Using zero-based numbering, the nth Catalan number is given directly in terms of binomial
coefficients by the following equation.
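Cn = (1 / (n + 1)) * C(2n, n) = (2n)! / ((n + 1)! * n!), for n >= 0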
Auxiliary Space: O(n)
Time Complexity: O(n^2)
C# Implementation
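A sketch that matches the stated bounds, using the recurrence C(0) = 1 and C(i) = sum over j of
C(j) * C(i-1-j):

public static long CatalanNumber(int n)
{
    long[] catalan = new long[n + 1]; // O(n) auxiliary space
    catalan[0] = 1;
    for (int i = 1; i <= n; i++)
        for (int j = 0; j < i; j++)   // O(n^2) time in total
            catalan[i] += catalan[j] * catalan[i - 1 - j];
    return catalan[n];
}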
Chapter 16: Check if a tree is BST or not
Examples
If a given input tree follows Binary search tree property or not
A binary tree follows the binary search tree property if:

1. it is empty, or
2. it has no subtrees, or
3. for every node x in the tree, all the keys (if any) in the left sub tree are less than key(x)
and all the keys (if any) in the right sub tree are greater than key(x).
is_BST(root):
    if root == NULL:
        return true

    // Check values in left subtree
    if root->left != NULL:
        max_key_in_left = find_max_key(root->left)
        if max_key_in_left > root->key:
            return false

    // Check values in right subtree
    if root->right != NULL:
        min_key_in_right = find_min_key(root->right)
        if min_key_in_right < root->key:
            return false

    return is_BST(root->left) and is_BST(root->right)
The above recursive algorithm is correct but inefficient, because it traverses each node multiple
times.
Another approach to minimize the multiple visits of each node is to remember the min and max
possible values of the keys in the subtree we are visiting. Let the minimum possible value of any
key be K_MIN and maximum value be K_MAX. When we start from the root of the tree, the range of
values in the tree is [K_MIN,K_MAX]. Let the key of root node be x. Then the range of values in left
subtree is [K_MIN,x) and the range of values in right subtree is (x,K_MAX]. We will use this idea to
develop a more efficient algorithm.
The initial call will be is_BST(my_tree_root, KEY_MIN, KEY_MAX), where KEY_MIN and KEY_MAX are the
smallest and largest values a key can take.
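A sketch of this check, in the same pseudocode style as above (each node is now visited only
once, so it runs in O(n)):

is_BST(node, K_MIN, K_MAX):
    if node == NULL:
        return true
    // every key in this subtree must lie strictly between K_MIN and K_MAX
    if node->key <= K_MIN or node->key >= K_MAX:
        return false
    // left keys must stay below node->key, right keys above it
    return is_BST(node->left, K_MIN, node->key) and is_BST(node->right, node->key, K_MAX)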
Another approach is to do an in-order traversal of the binary tree. If the in-order traversal
produces a sorted sequence of keys, then the given tree is a BST. To check that the in-order
sequence is sorted, remember the value of the previously visited node and compare it against the
current node.
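A sketch of that in-order check (prev remembers the previously visited node; pseudocode in the
same style):

prev := NULL

is_BST_inorder(node):
    if node == NULL:
        return true
    if not is_BST_inorder(node->left):
        return false
    // the in-order key sequence must be strictly increasing
    if prev != NULL and prev->key >= node->key:
        return false
    prev := node
    return is_BST_inorder(node->right)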
Chapter 17: Check two strings are anagrams
Introduction
Two strings that consist of the same characters (with the same counts) are called anagrams. I
have used JavaScript here.

We will create a hash of str1, increasing the count of each character by one (+1). Then we will
loop over the 2nd string, check that every character is present in the hash, and decrease the
value of that hash key. Finally, if all the values in the hash are zero, the strings are anagrams.
Examples
Sample input and output
Ex1: let str1 = "stackoverflow". After hashing str1 we get:
hashMap = {
s : 1,
t : 1,
a : 1,
c : 1,
k : 1,
o : 2,
v : 1,
e : 1,
r : 1,
f : 1,
l : 1,
w : 1
}
You can see that the hash key 'o' contains the value 2 because 'o' occurs twice in the string.

Now loop over str2 and check that each character is present in hashMap; if yes, decrease the
value of that hashMap key, else return false (which indicates it's not an anagram).
hashMap = {
s : 0,
t : 0,
a : 0,
c : 0,
k : 0,
o : 0,
v : 0,
e : 0,
r : 0,
f : 0,
l : 0,
w : 0
}
Now, loop over the hashMap object and check that all the values in it are zero.
(function(){
    var str1 = "stackoverflow";
    var str2 = "wolfrevokcats"; // str1 reversed, so it is an anagram
    var hashMap = {};

    // Create hash map of str1 characters and increase value by one (+1).
    function createStr1HashMap(s) {
        for (var i = 0; i < s.length; i++)
            hashMap[s[i]] = (hashMap[s[i]] || 0) + 1;
    }

    // Check str2 characters are keys in the hash map and decrease value by one (-1).
    function createStr2HashMap(s) {
        for (var i = 0; i < s.length; i++) {
            if (!(s[i] in hashMap)) return false;
            hashMap[s[i]] -= 1;
        }
        return true;
    }

    createStr1HashMap(str1);
    var isAnagram = createStr2HashMap(str2);

    // All counts must be back to zero for an anagram.
    for (var key in hashMap) {
        if (hashMap[key] !== 0) { isAnagram = false; break; }
    }
    console.log(isAnagram); // true
})();
Chapter 18: Counting Sort
Examples
Counting Sort Basic Information
Counting sort is an integer sorting algorithm for a collection of objects that sorts according to the
keys of the objects.
Steps
1. Construct a working array C that has size equal to the range of the input array A.
2. Iterate through A, assigning C[x] based on the number of times x appeared in A.
3. Transform C into an array where C[x] refers to the number of values ≤ x by iterating through
the array, assigning to each C[x] the sum of its prior value and all values in C that come
before it.
4. Iterate backwards through A, placing each value into a new sorted array B at the index
recorded in C. This is done for a given A[x] by setting B[C[A[x]]] = A[x] and then incrementing
C[A[x]], in case there were duplicate values in the original unsorted array.
Pseudocode Implementation

Constraints: the keys key(x) are integers in the range [0, k).
Pseudocode:
count = [0] * k                 # histogram of the keys
output = [None] * len(input)

for x in input:
    count[key(x)] += 1

total = 0
for i in range(k):              # prefix sums: count[i] becomes the number of keys < i
    oldCount = count[i]
    count[i] = total
    total += oldCount

for x in input:
    output[count[key(x)]] = x
    count[key(x)] += 1          # so duplicate keys go to the next slot

return output
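Since the pseudocode is nearly valid Python already, a runnable version (the function name and
demo values are mine) is:

def counting_sort(items, k, key=lambda x: x):
    count = [0] * k
    output = [None] * len(items)
    for x in items:
        count[key(x)] += 1
    total = 0
    for i in range(k):
        count[i], total = total, total + count[i]  # prefix sums
    for x in items:
        output[count[key(x)]] = x
        count[key(x)] += 1
    return output

print(counting_sort([3, 1, 0, 3, 2], 4))  # [0, 1, 2, 3, 3]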
C# Implementation
Chapter 19: Cycle Sort
Examples
Cycle Sort Basic Information
Cycle sort is an in-place comparison sorting algorithm that is theoretically optimal in terms of
the total number of writes to the original array, unlike any other in-place sorting algorithm. Cycle
sort is an unstable sorting algorithm. It is based on the idea that the permutation to be sorted can
be factored into cycles, which can individually be rotated to give a sorted result.
Pseudocode Implementation
procedure cycleSort(array): // sorts the array in place and returns the number of writes
writes = 0
for cycleStart from 0 to length(array) - 2
item = array[cycleStart]
pos = cycleStart
for i from cycleStart + 1 to length(array) - 1
if array[i] < item:
pos += 1
if pos == cycleStart:
continue
while item == array[pos]:
pos += 1
array[pos], item = item, array[pos]
writes += 1
while pos != cycleStart:
pos = cycleStart
for i from cycleStart + 1 to length(array) - 1
if array[i] < item:
pos += 1
while item == array[pos]:
pos += 1
array[pos], item = item, array[pos]
writes += 1
return writes
C# Implementation
Chapter 20: Depth First Search
Examples
Introduction To Depth-First Search
Depth-first search is an algorithm for traversing or searching tree or graph data structures. One
starts at the root and explores as far as possible along each branch before backtracking. A version
of depth-first search was investigated in the 19th century by the French mathematician Charles
Pierre Trémaux as a strategy for solving mazes.

Depth-first search is a systematic way to find all the vertices reachable from a source vertex. Like
breadth-first search, DFS traverses a connected component of a given graph and defines a
spanning tree. The basic idea of depth-first search is to methodically explore every edge, starting
over from different vertices as necessary. As soon as we discover a vertex, DFS starts exploring
from it (unlike BFS, which puts a vertex on a queue so that it explores from it later).
Here we can see one important keyword: backedge. The edge 5-1 is called a backedge. This
is because we're not yet done with node 1, so going from another node to node 1 means there's
a cycle in the graph. In DFS, if we can go from one gray node to another, we can be certain that
the graph has a cycle. This is one of the ways of detecting cycles in a graph. Depending on the
source node and the order in which we visit the nodes, we can find any edge of a cycle as a
backedge. For example: if we had gone to 5 from 1 first, we'd have found 2-1 as the backedge.
The edges that we take to go from a gray node to a white node are called tree edges. If we keep
only the tree edges and remove the others, we get the DFS tree.
In an undirected graph, if we can visit an already visited node, that must be a backedge. But for
directed graphs, we must check the colors: if and only if we can go from one gray node to another
gray node, that edge is called a backedge.
In DFS, we can also keep timestamps for each node, which can be used in many ways (e.g.:
Topological Sort).
1. When a node v is changed from white to gray the time is recorded in d[v].
2. When a node v is changed from gray to black the time is recorded in f[v].
Here d[] means discovery time and f[] means finishing time. Our pseudo-code will look like:
Procedure DFS(G):
for each node u in V[G]
color[u] := white
parent[u] := NULL
end for
time := 0
for each node u in V[G]
if color[u] == white
DFS-Visit(u)
end if
end for
Procedure DFS-Visit(u):
color[u] := gray
time := time + 1
d[u] := time
for each node v adjacent to u
if color[v] == white
parent[v] := u
DFS-Visit(v)
end if
end for
color[u] := black
time := time + 1
f[u] := time
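A direct transcription of this pseudo-code into Python (the graph is given as an adjacency-list
dict; the names are mine):

WHITE, GRAY, BLACK = 0, 1, 2

def dfs(graph):
    color = {u: WHITE for u in graph}
    parent = {u: None for u in graph}
    d, f = {}, {}
    time = [0]  # mutable counter shared with the nested function

    def visit(u):
        color[u] = GRAY
        time[0] += 1
        d[u] = time[0]        # discovery time
        for v in graph[u]:
            if color[v] == WHITE:
                parent[v] = u
                visit(v)
        color[u] = BLACK
        time[0] += 1
        f[u] = time[0]        # finishing time

    for u in graph:
        if color[u] == WHITE:
            visit(u)
    return d, f, parent

g = {1: [2, 5], 2: [1, 3], 3: [2], 5: [1]}
print(dfs(g))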
Complexity:
Each node and each edge is visited once. So the complexity of DFS is O(V + E), where V denotes
the number of nodes and E denotes the number of edges.
Chapter 21: Dijkstra’s Algorithm
Examples
Dijkstra's Shortest Path Algorithm
Before proceeding, it is recommended to have a brief idea about Adjacency Matrix and BFS
Dijkstra's algorithm is known as a single-source shortest path algorithm. It is used for finding the
shortest paths between nodes in a graph, which may represent, for example, road networks. It
was conceived by Edsger W. Dijkstra in 1956 and published three years later.
We can find shortest path using Breadth First Search (BFS) searching algorithm. This algorithm
works fine, but the problem is, it assumes the cost of traversing each path is same, that means the
cost of each edge is same. Dijkstra's algorithm helps us to find the shortest path where the cost of
each path is not the same.
At first we will see, how to modify BFS to write Dijkstra's algorithm, then we will add priority queue
to make it a complete Dijkstra's algorithm.
Let's say the distance of each node from the source is kept in the d[] array. For example, d[3]
represents the time taken to reach node 3 from the source. If we don't know the distance, we will
store infinity in d[3]. Also, let cost[u][v] represent the cost of the edge u-v; that is, it takes
cost[u][v] to go from node u to node v.
We need to understand Edge Relaxation. Let's say, from your house, that is source, it takes 10
minutes to go to place A. And it takes 25 minutes to go to place B. We have,
d[A] = 10
d[B] = 25
Now let's say it takes 7 minutes to go from place A to place B, that means:
cost[A][B] = 7
Then we can go to place B from source by going to place A from source and then from place A,
going to place B, which will take 10 + 7 = 17 minutes, instead of 25 minutes. So, since
d[A] + cost[A][B] < d[B], we update:

d[B] = d[A] + cost[A][B] = 17
This is called relaxation. We will go from node u to node v and if d[u] + cost[u][v] < d[v] then we
will update d[v] = d[u] + cost[u][v].
In BFS, we didn't need to visit any node twice. We only checked if a node is visited or not. If it was
not visited, we pushed the node in queue, marked it as visited and incremented the distance by 1.
In Dijkstra, we can push a node into the queue more than once and, instead of just marking it
visited, we relax or update the new edge. Let's look at one example: say edge 1-2 has cost 2,
edge 1-3 has cost 5, edge 2-3 has cost 1 and edge 3-4 has cost 3.
d[1] = 0
d[2] = d[3] = d[4] = infinity (or a large value)
We set d[2], d[3] and d[4] to infinity because we don't know the distances yet; the distance of the
source is of course 0. Now, we go to other nodes from the source, and if we can update them, we
push them into the queue. Say, for example, we traverse edge 1-2: since d[1] + 2 < d[2], this
makes d[2] = 2. Similarly, we traverse edge 1-3, which makes d[3] = 5.
We can clearly see that 5 is not the shortest distance we can achieve for node 3. So traversing
a node only once, like in BFS, doesn't work here. If we go from node 2 to node 3 using edge 2-3,
we can update d[3] = d[2] + 1 = 3. So we can see that one node can be updated many times.
How many times, you ask? The maximum number of times a node can be updated is the
in-degree of that node.
Let's see the pseudo-code for visiting any node multiple times. We will simply modify BFS:
procedure BFSmodified(G, source):
Q = queue()
distance[] = infinity
Q.enqueue(source)
distance[source]=0
while Q is not empty
u <- Q.pop()
for all edges from u to v in G.adjacentEdges(u) do
if distance[u] + cost[u][v] < distance[v]
distance[v] = distance[u] + cost[u][v]
end if
end for
end while
Return distance
This can be used to find the shortest path of all node from the source. The complexity of this code
is not so good. Here's why,
In BFS, when we go from node 1 to all other nodes, we follow first come, first serve method. For
example, we went to node 3 from source before processing node 2. If we go to node 3 from
source, we update node 4 as 5 + 3 = 8. When we again update node 3 from node 2, we need to
update node 4 as 3 + 3 = 6 again! So node 4 is updated twice.
Dijkstra proposed, instead of going for First come, first serve method, if we update the nearest
nodes first, then it'll take less updates. If we processed node 2 before, then node 3 would have
been updated before, and after updating node 4 accordingly, we'd easily get the shortest distance!
The idea is to choose from the queue the node that is closest to the source. So we will use a
priority queue here, so that when we pop the queue, it will bring us the closest node u from the
source. How will it do that? It will compare nodes using the value of d[u].
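A sketch of the resulting procedure, in the same style as BFSmodified but with the queue
replaced by a priority queue keyed on d[]:

Procedure Dijkstra(G, source):
PQ = priority_queue()       // ordered by the current distance d[]
d[] = infinity
d[source] = 0
PQ.push(source)
while PQ is not empty
    u <- PQ.pop()           // the node in PQ closest to the source
    for all edges from u to v in G.adjacentEdges(u) do
        if d[u] + cost[u][v] < d[v]
            d[v] = d[u] + cost[u][v]
            PQ.push(v)
        end if
    end for
end while
Return d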
The pseudo-code returns distance of all other nodes from the source. If we want to know distance
of a single node v, we can simply return the value when v is popped from the queue.
Now, does Dijkstra's algorithm work when there's a negative edge? If there's a negative cycle,
then an infinite loop will occur, as the algorithm will keep reducing the cost every time. Even with a
single negative edge, Dijkstra won't work, unless we return right after the target is popped. But
then, it won't be a Dijkstra algorithm. We'll need the Bellman-Ford algorithm for processing
negative edges/cycles.
Complexity:
The complexity of BFS is O(V + E), where V is the number of nodes and E is the number of
edges. For Dijkstra, each priority queue operation takes O(logV), so with a binary heap the total
complexity is O((V + E)logV); with a Fibonacci heap this improves to O(VlogV + E).
Below is a Java example to solve Dijkstra's Shortest Path Algorithm using Adjacency Matrix
import java.util.*;
import java.lang.*;
import java.io.*;
class ShortestPath
{
    static final int V=9;

    // pick the not-yet-finalized node with the smallest tentative distance
    int minDistance(int dist[], Boolean sptSet[])
    {
        int min = Integer.MAX_VALUE, min_index=-1;
        for (int v = 0; v < V; v++)
            if (!sptSet[v] && dist[v] <= min) { min = dist[v]; min_index = v; }
        return min_index;
    }

    void dijkstra(int graph[][], int src)
    {
        int dist[] = new int[V];
        Boolean sptSet[] = new Boolean[V];
        for (int i = 0; i < V; i++) { dist[i] = Integer.MAX_VALUE; sptSet[i] = false; }
        dist[src] = 0;
        for (int count = 0; count < V - 1; count++) {
            int u = minDistance(dist, sptSet);
            sptSet[u] = true;
            for (int v = 0; v < V; v++)          // relax all edges out of u
                if (!sptSet[v] && graph[u][v]!=0 &&
                    dist[u] != Integer.MAX_VALUE &&
                    dist[u]+graph[u][v] < dist[v])
                    dist[v] = dist[u] + graph[u][v];
        }
        printSolution(dist, V);
    }

    void printSolution(int dist[], int n)
    {
        for (int i = 0; i < n; i++) System.out.println(i + "\t" + dist[i]);
    }
}
Chapter 22: Dynamic Programming
Introduction
Dynamic programming is a widely used concept, and it is often used for optimization. It refers to
simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive
manner, usually with a bottom-up approach. There are two key attributes that a problem must
have in order for dynamic programming to be applicable: "optimal substructure" and "overlapping
sub-problems". To achieve its optimization, dynamic programming uses a technique called
memoization.
Remarks
Dynamic Programming is an improvement on Brute Force, see this example to understand how
one can obtain a Dynamic Programming solution from Brute Force.
1. Overlapping Problems
2. Optimal Substructure
Overlapping Subproblems means that results of smaller versions of the problem are reused
multiple times in order to arrive at the solution to the original problem
Optimal Substructure means that there is a method of calculating a problem from its
subproblems.
A Dynamic Programming Solution has 2 main components, the State and the Transition
The time taken by a Dynamic Programming Solution can be calculated as No. of States *
Transition Time. Thus if a solution has N^2 states and the transition is O(N), then the solution would
take roughly O(N^3) time.
Examples
Knapsack Problem
0-1 Knapsack
The Knapsack Problem: given a set of items, each with a weight, a value and exactly 1 copy,
determine which item(s) to include in a collection so that the total weight is less than or equal to a
given limit and the total value is as large as possible.
C++ Example:
Implementation:
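The listing itself is not reproduced here; a minimal sketch that solves the test below (first line: the
number of items n and the capacity, then one "weight value" pair per item) could be:

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
    int n, W;
    cin >> n >> W;                        // number of items and the weight limit
    vector<int> wt(n), val(n);
    for (int i = 0; i < n; i++)
        cin >> wt[i] >> val[i];           // one "weight value" pair per item

    vector<int> dp(W + 1, 0);             // dp[c] = best value using capacity c
    for (int i = 0; i < n; i++)
        for (int c = W; c >= wt[i]; c--)  // downwards, so each item is used at most once
            dp[c] = max(dp[c], dp[c - wt[i]] + val[i]);

    cout << dp[W] << endl;
    return 0;
}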
Test:
3 5
5 2
2 1
3 2
Output:

3

That means the maximum value that can be achieved is 3, which is achieved by choosing (2,1)
and (3,2).
Unbounded Knapsack
The Unbounded Knapsack Problem: given a set of items, each with a weight, a value and
infinitely many copies, determine the number of each item to include in a collection so that the
total weight is less than or equal to a given limit and the total value is as large as possible.
Python(2.7.11) Example:
Implementation:
def unbounded_knapsack(w, v, c): # weights, values, capacity
    m = [0]  # m[r] = best value achievable with capacity r
    for r in range(1, c + 1):
        val = 0
        for i, wi in enumerate(w):
            if wi > r:
                continue
            val = max(val, v[i] + m[r-wi])
        m.append(val)
    return m[c] # the maximum value that can be achieved
Test:
w = [2, 3, 4, 5, 6]
v = [2, 4, 6, 8, 9]
c = 13
Output:
20
That means the maximum value that can be achieved is 20, which is achieved by choosing (5, 8),
(5, 8) and (3, 4).
Weighted Job Scheduling can also be called the Weighted Activity Selection problem.

The problem is: given certain jobs with their start time and end time, and a profit you make when
you finish the job, what is the maximum profit you can make, given that no two jobs can be
executed in parallel?

This looks like Activity Selection using a Greedy Algorithm, but there's an added twist: instead of
maximizing the number of jobs finished, we focus on making the maximum profit. The number of
jobs performed doesn't matter here.
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | A | B | C | D | E | F |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (2,5) | (6,7) | (7,9) | (1,3) | (5,8) | (4,6) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 6 | 4 | 2 | 5 | 11 | 5 |
+-------------------------+---------+---------+---------+---------+---------+---------+
The jobs are denoted with a name, their start and finishing time and profit. After a few iterations,
we can find out if we perform Job-A and Job-E, we can get the maximum profit of 17. Now how to
find this out using an algorithm?
The first thing we do is sort the jobs by their finishing time in non-decreasing order. Why do we do
this? It's because if we select a job that takes less time to finish, then we leave the most amount of
time for choosing other jobs. We have:
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
We'll have an additional temporary array Acc_Prof of size n (here, n denotes the total number of
jobs). This will contain the maximum accumulated profit of performing the jobs. Don't get it? Wait
and watch. We'll initialize the values of the array with the profit of each job. That means
Acc_Prof[i] will at first hold the profit of performing the i-th job.
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Now let's denote position 2 with i, and position 1 will be denoted with j. Our strategy will be to
iterate j from 1 to i-1 and after each iteration, we will increment i by 1, until i becomes n+1.
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
We check if Job[i] and Job[j] overlap, that is, if the finish time of Job[j] is greater than Job[i]'s
start time, then these two jobs can't be done together. However, if they don't overlap, we'll check if
Acc_Prof[j] + Profit[i] > Acc_Prof[i]. If this is the case, we will update Acc_Prof[i] = Acc_Prof[j]
+ Profit[i]. That is:

if Acc_Prof[j] + Profit[i] > Acc_Prof[i]:
    Acc_Prof[i] = Acc_Prof[j] + Profit[i]
Here Acc_Prof[j] + Profit[i] represents the accumulated profit of doing these two jobs together.
Let's check it for our example:

Here Job[j] overlaps with Job[i], so these two can't be done together. Since our j is equal to i-1,
we increment the value of i to i+1, that is 3. And we make j = 1.
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Now Job[j] and Job[i] don't overlap. The total amount of profit we can make by picking these two
jobs is: Acc_Prof[j] + Profit[i] = 5 + 5 = 10 which is greater than Acc_Prof[i]. So we update
Acc_Prof[i] = 10. We also increment j by 1. We get,
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Here, Job[j] overlaps with Job[i] and j is also equal to i-1. So we increment i by 1, and make j = 1
. We get,
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Now, Job[j] and Job[i] don't overlap, we get the accumulated profit 5 + 4 = 9, which is greater
than Acc_Prof[i]. We update Acc_Prof[i] = 9 and increment j by 1.
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 9 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Again Job[j] and Job[i] don't overlap. The accumulated profit is: 6 + 4 = 10, which is greater than
Acc_Prof[i]. We again update Acc_Prof[i] = 10. We increment j by 1. We get:
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 10 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
If we continue this process, after iterating through the whole table using i, our table will finally look
like:
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 14 | 17 | 8 |
+-------------------------+---------+---------+---------+---------+---------+---------+
If we iterate through the array Acc_Prof, we can find out the maximum profit to be 17! The
pseudo-code:
Procedure WeightedJobScheduling(Job)
sort Job according to finish time in non-decreasing order
for i -> 2 to n
for j -> 1 to i-1
if Job[j].finish_time <= Job[i].start_time
if Acc_Prof[j] + Profit[i] > Acc_Prof[i]
Acc_Prof[i] = Acc_Prof[j] + Profit[i]
endif
endif
endfor
endfor
maxProfit = 0
for i -> 1 to n
if maxProfit < Acc_Prof[i]
maxProfit = Acc_Prof[i]
return maxProfit
The complexity of populating the Acc_Prof array is O(n^2). The array traversal takes O(n). So the
total complexity of this algorithm is O(n^2).
Now, If we want to find out which jobs were performed to get the maximum profit, we need to
traverse the array in reverse order and if the Acc_Prof matches the maxProfit, we will push the
name of the job onto a stack and subtract the Profit of that job from maxProfit. We will do this until
our maxProfit > 0 or we reach the beginning of the Acc_Prof array. The pseudo-code will look
like:
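A sketch that matches this description (n is the number of jobs; on our example it recovers jobs
E and A):

maxProfit := maximum value in Acc_Prof
S = Stack()
for i -> n down to 1
    if maxProfit <= 0
        break
    endif
    if Acc_Prof[i] == maxProfit
        S.push(Job[i].name)
        maxProfit := maxProfit - Profit[i]
    endif
endfor
Print the contents of S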
One thing to remember, if there are multiple job schedules that can give us maximum profit, we
can only find one job schedule via this procedure.
Edit Distance
The problem statement: given two strings str1 and str2, what is the minimum number of
operations that must be performed on str1 so that it gets converted to str2?
Implementation in Java
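A standard DP sketch of this idea (one possible implementation; dp[i][j] holds the minimum
operations needed to convert the first i characters of str1 into the first j characters of str2):

public static int editDistance(String str1, String str2) {
    int n = str1.length(), m = str2.length();
    int[][] dp = new int[n + 1][m + 1];
    for (int i = 0; i <= n; i++) dp[i][0] = i;   // delete all i characters
    for (int j = 0; j <= m; j++) dp[0][j] = j;   // insert all j characters
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= m; j++)
            if (str1.charAt(i - 1) == str2.charAt(j - 1))
                dp[i][j] = dp[i - 1][j - 1];     // characters match, no cost
            else
                dp[i][j] = 1 + Math.min(dp[i - 1][j - 1],              // replace
                               Math.min(dp[i - 1][j], dp[i][j - 1]));  // remove / insert
    return dp[n][m];
}

For example, editDistance("azcef", "abcdef") evaluates to 2 (replace z with b, then insert d).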
Output
Given two strings, we have to find the length of the longest common subsequence present in
both of them.
Example
Implementation in Java
//Recursive function
public int lcs(String str1, String str2, int m, int n){
if(m==0 || n==0)
return 0;
if(str1.charAt(m-1) == str2.charAt(n-1))
return 1 + lcs(str1, str2, m-1, n-1);
else
return Math.max(lcs(str1, str2, m-1, n), lcs(str1, str2, m, n-1));
}
//Iterative function
public int lcs2(String str1, String str2){
int lcs[][] = new int[str1.length()+1][str2.length()+1];
for(int i=0;i<=str1.length();i++){
for(int j=0;j<=str2.length();j++){
if(i==0 || j== 0){
lcs[i][j] = 0;
}
else if(str1.charAt(i-1) == str2.charAt(j-1)){
lcs[i][j] = 1 + lcs[i-1][j-1];
}else{
lcs[i][j] = Math.max(lcs[i-1][j], lcs[i][j-1]);
}
}
}
return lcs[str1.length()][str2.length()];
}
}
Output
Fibonacci Number
Bottom up approach for printing the nth Fibonacci number using Dynamic Programming.
Recursive Tree
fib(5)
/ \
fib(4) fib(3)
/ \ / \
fib(3) fib(2) fib(2) fib(1)
/ \ / \ / \
fib(2) fib(1) fib(1) fib(0) fib(1) fib(0)
/ \
fib(1) fib(0)
Overlapping Sub-problems
Here fib(0),fib(1) and fib(3) are the overlapping sub-problems.fib(0) is getting repeated 3 times,
fib(1) is getting repeated 5 times and fib(3) is getting repeated 2 times.
Implementation
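A bottom-up sketch of the implementation (each value is computed exactly once, giving the O(n)
time stated below):

public static int fib(int n) {
    if (n <= 1) return n;
    int[] f = new int[n + 1]; // f[i] holds fib(i)
    f[0] = 0;
    f[1] = 1;
    for (int i = 2; i <= n; i++)
        f[i] = f[i - 1] + f[i - 2];
    return f[n];
}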
Time Complexity
O(n)
Given 2 strings str1 and str2, we have to find the length of the longest common substring
between them.
Examples
Input: X = "zxabcdezy", Y = "yzabcdezx"
Output: 6
Implementation in Java
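A standard DP sketch of this idea (dp[i][j] is the length of the longest common suffix of the first i
characters of X and the first j characters of Y; the answer is the largest such value):

public static int longestCommonSubstring(String X, String Y) {
    int m = X.length(), n = Y.length();
    int[][] dp = new int[m + 1][n + 1];
    int result = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            if (X.charAt(i - 1) == Y.charAt(j - 1)) {
                dp[i][j] = dp[i - 1][j - 1] + 1;  // extend the common suffix
                result = Math.max(result, dp[i][j]);
            }
    return result; // returns 6 for X = "zxabcdezy", Y = "yzabcdezx"
}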
Time Complexity
O(m*n)
Chapter 23: Dynamic Time Warping
Examples
Introduction To Dynamic Time Warping
Dynamic Time Warping(DTW) is an algorithm for measuring similarity between two temporal
sequences which may vary in speed. For instance, similarities in walking could be detected using
DTW, even if one person was walking faster than the other, or if there were accelerations and
decelerations during the course of an observation. It can be used to match a sample voice
command with other commands, even if the person talks faster or slower than the prerecorded
sample voice. DTW can be applied to temporal sequences of video, audio and graphics data;
indeed, any data which can be turned into a linear sequence can be analyzed with DTW.
In general, DTW is a method that calculates an optimal match between two given sequences with
certain restrictions. But let's stick to the simpler points here. Let's say, we have two voice
sequences Sample and Test, and we want to check if these two sequences match or not. Here
voice sequence refers to the converted digital signal of your voice. It might be the amplitude or
frequency of your voice that denotes the words you say. Let's assume:
Sample = {1, 2, 3, 5, 5, 5, 6}
Test = {1, 1, 2, 2, 3, 5}
We want to find out the optimal match between these two sequences.
At first, we define the distance between two points, d(x, y), where x and y represent the two
points. Let:

d(x, y) = |x - y|
Let's create a 2D matrix Table using these two sequences. We'll calculate the distances between
each point of Sample with every points of Test and find the optimal match between them.
+------+------+------+------+------+------+------+------+
| | 0 | 1 | 1 | 2 | 2 | 3 | 5 |
+------+------+------+------+------+------+------+------+
| 0 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 1 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 2 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 3 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 6 | | | | | | | |
+------+------+------+------+------+------+------+------+
Here, Table[i][j] represents the optimal distance between two sequences if we consider the
sequence up to Sample[i] and Test[j], considering all the optimal distances we observed before.
For the first row, if we take no values from Sample, the distance between this and Test will be
infinity. So we put infinity on the first row. Same goes for the first column. If we take no values from
Test, the distance between this one and Sample will also be infinity. And the distance between 0
and 0 will simply be 0. We get,
+------+------+------+------+------+------+------+------+
| | 0 | 1 | 1 | 2 | 2 | 3 | 5 |
+------+------+------+------+------+------+------+------+
| 0 | 0 | inf | inf | inf | inf | inf | inf |
+------+------+------+------+------+------+------+------+
| 1 | inf | | | | | | |
+------+------+------+------+------+------+------+------+
| 2 | inf | | | | | | |
+------+------+------+------+------+------+------+------+
| 3 | inf | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | inf | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | inf | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | inf | | | | | | |
+------+------+------+------+------+------+------+------+
| 6 | inf | | | | | | |
+------+------+------+------+------+------+------+------+
Now for each step, we'll consider the distance between the points in question and add it to the
minimum distance we found so far. This will give us the optimal distance of the two sequences up
to that position. Our formula will be:

Table[i][j] := d(Sample[i], Test[j]) + minimum(Table[i-1][j-1], Table[i][j-1], Table[i-1][j])
For the first one, d(1, 1) = 0, Table[0][0] represents the minimum. So the value of Table[1][1] will
be 0 + 0 = 0. For the second one, d(1, 2) = 0. Table[1][1] represents the minimum. The value will
be: Table[1][2] = 0 + 0 = 0. If we continue this way, after finishing, the table will look like:
+------+------+------+------+------+------+------+------+
| | 0 | 1 | 1 | 2 | 2 | 3 | 5 |
+------+------+------+------+------+------+------+------+
| 0 | 0 | inf | inf | inf | inf | inf | inf |
+------+------+------+------+------+------+------+------+
| 1 | inf | 0 | 0 | 1 | 2 | 4 | 8 |
+------+------+------+------+------+------+------+------+
| 2 | inf | 1 | 1 | 0 | 0 | 1 | 4 |
+------+------+------+------+------+------+------+------+
| 3 | inf | 3 | 3 | 1 | 1 | 0 | 2 |
+------+------+------+------+------+------+------+------+
| 5 | inf | 7 | 7 | 4 | 4 | 2 | 0 |
+------+------+------+------+------+------+------+------+
| 5 | inf | 11 | 11 | 7 | 7 | 4 | 0 |
+------+------+------+------+------+------+------+------+
| 5 | inf | 15 | 15 | 10 | 10 | 6 | 0 |
+------+------+------+------+------+------+------+------+
| 6 | inf | 20 | 20 | 14 | 14 | 9 | 1 |
+------+------+------+------+------+------+------+------+
The value at Table[7][6] represents the minimum distance between these two given sequences.
Here 1 means the minimum distance between Sample and Test is 1.
Now if we backtrack from the last point, all the way back towards the starting (0, 0) point, we get a
long line that moves horizontally, vertically and diagonally. Our backtracking procedure will be: from
the current position (i, j), move to whichever of (i-1, j-1), (i, j-1) and (i-1, j) holds the minimum value.
We'll continue this till we reach (0, 0). Each move has its own meaning:
• A horizontal move represents deletion. That means our Test sequence accelerated during
this interval.
• A vertical move represents insertion. That means our Test sequence decelerated during this
interval.
• A diagonal move represents match. During this period Test and Sample were same.
Procedure DTW(Sample, Test):
n := Sample.length
m := Test.length
Create Table[n + 1][m + 1]
for i from 1 to n
Table[i][0] := infinity
end for
for i from 1 to m
Table[0][i] := infinity
end for
Table[0][0] := 0
for i from 1 to n
for j from 1 to m
Table[i][j] := d(Sample[i], Test[j])
+ minimum(Table[i-1][j-1], //match
Table[i][j-1], //insertion
Table[i-1][j]) //deletion
end for
end for
Return Table[n][m]
We can also add a locality constraint. That is, we require that if Sample[i] is matched with Test[j],
then |i - j| is no larger than w, a window parameter.
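To make the procedure concrete, here is a minimal Java sketch of the tabulation above, taking d(x, y) = |x - y| as in the worked example (class and method names are illustrative, not from the original):

public class DTW {
    // Computes the DTW distance of two sequences, using d(x, y) = |x - y|.
    static double dtw(double[] sample, double[] test) {
        int n = sample.length, m = test.length;
        double[][] table = new double[n + 1][m + 1];
        for (int i = 1; i <= n; i++) table[i][0] = Double.POSITIVE_INFINITY;
        for (int j = 1; j <= m; j++) table[0][j] = Double.POSITIVE_INFINITY;
        table[0][0] = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double d = Math.abs(sample[i - 1] - test[j - 1]);
                table[i][j] = d + Math.min(table[i - 1][j - 1],      // match
                                  Math.min(table[i][j - 1],          // insertion
                                           table[i - 1][j]));        // deletion
            }
        }
        return table[n][m];
    }

    public static void main(String[] args) {
        double[] sample = {1, 2, 3, 5, 5, 5, 6};
        double[] test = {1, 1, 2, 2, 3, 5};
        System.out.println(dtw(sample, test));    // prints 1.0 for the sequences above
    }
}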
Complexity:
The complexity of computing DTW is O(m * n) where m and n represent the length of each
sequence. Faster techniques for computing DTW include PrunedDTW, SparseDTW and
FastDTW.
Applications:

DTW has been applied to automatic speech recognition, online signature and gesture recognition,
and the mining and comparison of time series that may vary in speed.
Chapter 24: Edit Distance Dynamic Algorithm
Introduction
Examples
Minimum Edits required to convert string 1 to string 2
The problem statement is: given two strings str1 and str2, find the minimum number of operations
that must be performed on str1 to convert it to str2. The operations can be:
1. Insert
2. Remove
3. Replace
To solve this problem we will use a 2D array dp[n+1][m+1] where n is the length of the first string
and m is the length of the second string. For our example, if str1 is azcef and str2 is abcdef then
our array will be dp[6][7] and our final answer will be stored at dp[5][6].
For dp[1][1] we have to check what we can do to convert a into a. It will be 0. For dp[1][2] we have
to check what we can do to convert a into ab. It will be 1, because we have to insert b. So after the
1st iteration our array will look like:
(a) (b) (c) (d) (e) (f)
+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 |
+---+---+---+---+---+---+---+
(a)| 1 | 0 | 1 | 2 | 3 | 4 | 5 |
+---+---+---+---+---+---+---+
(z)| 2 | | | | | | |
+---+---+---+---+---+---+---+
(c)| 3 | | | | | | |
+---+---+---+---+---+---+---+
(e)| 4 | | | | | | |
+---+---+---+---+---+---+---+
(f)| 5 | | | | | | |
+---+---+---+---+---+---+---+
For iteration 2:
For dp[2][1] we have to check that to convert az to a we need to remove z, hence dp[2][1] will be
1. Similarly, for dp[2][2] we need to replace z with b, hence dp[2][2] will be 1. So after the 2nd
iteration our dp[] array will look like:
       (a) (b) (c) (d) (e) (f)
   +---+---+---+---+---+---+---+
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
   +---+---+---+---+---+---+---+
(a)| 1 | 0 | 1 | 2 | 3 | 4 | 5 |
   +---+---+---+---+---+---+---+
(z)| 2 | 1 | 1 | 2 | 3 | 4 | 5 |
   +---+---+---+---+---+---+---+
(c)| 3 |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+
(e)| 4 |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+
(f)| 5 |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+
Continuing the same way for the remaining rows, the completed table will look like:
       (a) (b) (c) (d) (e) (f)
   +---+---+---+---+---+---+---+
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
   +---+---+---+---+---+---+---+
(a)| 1 | 0 | 1 | 2 | 3 | 4 | 5 |
   +---+---+---+---+---+---+---+
(z)| 2 | 1 | 1 | 2 | 3 | 4 | 5 |
   +---+---+---+---+---+---+---+
(c)| 3 | 2 | 2 | 1 | 2 | 3 | 4 |
   +---+---+---+---+---+---+---+
(e)| 4 | 3 | 3 | 2 | 2 | 2 | 3 |
   +---+---+---+---+---+---+---+
(f)| 5 | 4 | 4 | 3 | 3 | 3 | 2 |
   +---+---+---+---+---+---+---+
Our final answer is stored at dp[5][6] = 2: replace z with b, then insert d.
Implementation in Java
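A minimal Java version of the recurrence described above might look like this (a sketch; class and method names are illustrative):

public class EditDistance {
    static int minEdits(String str1, String str2) {
        int n = str1.length(), m = str2.length();
        int[][] dp = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) dp[i][0] = i;    // remove all i characters
        for (int j = 0; j <= m; j++) dp[0][j] = j;    // insert all j characters
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                if (str1.charAt(i - 1) == str2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1];      // characters match, no operation needed
                } else {
                    dp[i][j] = 1 + Math.min(dp[i - 1][j - 1],              // replace
                                   Math.min(dp[i - 1][j], dp[i][j - 1]));  // remove, insert
                }
            }
        }
        return dp[n][m];
    }

    public static void main(String[] args) {
        System.out.println(minEdits("azcef", "abcdef"));    // prints 2
    }
}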
Time Complexity
O(n*m), where n and m are the lengths of the two strings.
Chapter 25: Equation Solving
Examples
Linear Equation
1. Direct Methods: The common characteristic of direct methods is that they transform the
original equation into equivalent equations that can be solved more easily; that is, we obtain
the solution directly from an equation.
2. Iterative Methods: Iterative or indirect methods start with a guess of the solution and then
repeatedly refine the solution until a certain convergence criterion is reached. Iterative
methods are generally less efficient than direct methods because of the large number of
operations required. Examples: Jacobi's Iteration Method, Gauss-Seidel Iteration Method.
Implementation in C:
#include <stdio.h>
#include <math.h>

#define MAX 10

double a[MAX][MAX], b[MAX], x[MAX], Nx[MAX]; //coefficients, constants, current and new values

void JacobisMethod(int n, double x[], double b[], double a[][MAX]){
    int rootFound=0; //flag
    int i, j;
    for(i=0; i<n; i++){ //initialization
        Nx[i]=x[i];
    }
    while(!rootFound){
        for(i=0; i<n; i++){ //calculation
            Nx[i]=b[i];
            for(j=0; j<n; j++){
                if(j!=i) Nx[i] -= a[i][j]*x[j];
            }
            Nx[i] /= a[i][i];
        }
        rootFound=1; //verification
        for(i=0; i<n; i++){
            if(!( (Nx[i]-x[i])/x[i] > -0.000001 && (Nx[i]-x[i])/x[i] < 0.000001 )){
                rootFound=0;
                break;
            }
        }
        for(i=0; i<n; i++){ //update x with the newly computed values
            x[i]=Nx[i];
        }
    }
    return ;
}
int main(){
    //equation initialization
    int n=3; //number of variables

    //assign values
    a[0][0]=8; a[0][1]=2; a[0][2]=-2; b[0]=8; //8x₁+2x₂-2x₃=8
    a[1][0]=1; a[1][1]=-8; a[1][2]=3; b[1]=-4; //x₁-8x₂+3x₃=-4
    a[2][0]=2; a[2][1]=1; a[2][2]=9; b[2]=12; //2x₁+x₂+9x₃=12

    int i;
    for(i=0; i<n; i++){ //initialization
        x[i]=0;
    }

    JacobisMethod(n, x, b, a);

    for(i=0; i<n; i++){ //print the solution
        printf("x%d = %lf\n", i+1, x[i]);
    }
    return 0;
}
Non-Linear Equation
An equation of the type f(x)=0 is either algebraic or transcendental. These types of equations can
be solved by using two types of methods-
1. Direct Method: This method gives the exact value of all the roots directly in a finite number
of steps.
2. Indirect or Iterative Method: Iterative methods are best suited for computer programs to
solve an equation. It is based on the concept of successive approximation. In Iterative
Method there are two ways to solve an equation-
• Bracketing Method: We take two initial points where the root lies in between them.
Examples: Bisection Method, False Position Method.
• Open End Method: We take one or two initial values where the root may be anywhere.
Examples: Newton-Raphson Method, Successive Approximation Method, Secant Method.
Implementation in C:
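The functions below share the interval endpoints a and b and the helpers f, f2 and g, whose definitions are not shown here; the following is an assumed, illustrative setup (the concrete equation is a placeholder, not fixed by the text):

#include <stdio.h>
#include <math.h>

double a=1, b=2; //initial bracketing interval [a, b] (illustrative values)

double f(double x){ return x*x*x - x - 2; } //the equation f(x)=0 to solve (assumed example)
double f2(double x){ return 3*x*x - 1; } //derivative of f, used by Newton-Raphson
double g(double x){ return cbrt(x + 2); } //x = g(x) form of the same equation, used by Fixed Point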
/**
* Takes two initial values and shortens the distance by both side.
**/
double BisectionMethod(){
    double root=0;
    double c;
    int loopCounter=0;
    if(f(a)*f(b) < 0){
        while(1){
            loopCounter++;
            c=(a+b)/2;
            if( f(c) > -0.000001 && f(c) < 0.000001 ){ //root found
                root=c;
                break;
            }
            if( f(a)*f(c) < 0 ) b=c; //root lies in the left half
            else a=c; //root lies in the right half
        }
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
/**
* Takes two initial values and shortens the distance by single side.
**/
double FalsePosition(){
    double root=0;
    double c;
    int loopCounter=0;
    if(f(a)*f(b) < 0){
        while(1){
            loopCounter++;
            c=( (a*f(b)) - (b*f(a)) ) / ( f(b) - f(a) ); //where the chord crosses the x-axis
            if( f(c) > -0.000001 && f(c) < 0.000001 ){ //root found
                root=c;
                break;
            }
            if( f(a)*f(c) < 0 ) b=c;
            else a=c;
        }
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
/**
* Uses one initial value and gradually takes that value near to the real one.
**/
double NewtonRaphson(){
    double root=0;
    double x1=1;
    double x2=0;
    int loopCounter=0;
    while(1){
        loopCounter++;
        x2 = x1 - (f(x1)/f2(x1)); //f2 is the derivative of f
        //printf("%lf \t %lf \n", x2, f(x2)); //test
        if( f(x2) > -0.000001 && f(x2) < 0.000001 ){ //root found
            root=x2;
            break;
        }
        x1=x2;
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
/**
* Uses one initial value and gradually takes that value near to the real one.
**/
double FixedPoint(){
    double root=0;
    double x=1;
    int loopCounter=0;
    while(1){
        loopCounter++;
        if( (x-g(x)) > -0.000001 && (x-g(x)) < 0.000001 ){ //root found: x is (almost) a fixed point
            root=x;
            break;
        }
        x=g(x);
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
/**
* Uses two initial values & both values approach the root.
**/
double Secant(){
    double root=0;
    double x0=1;
    double x1=2;
    double x2=0;
    int loopCounter=0;
    while(1){
        loopCounter++;
        //printf("%lf \t %lf \t %lf \n", x0, x1, f(x1)); //test
        x2 = ((x0*f(x1))-(x1*f(x0))) / (f(x1)-f(x0));
        if( f(x2) > -0.000001 && f(x2) < 0.000001 ){ //root found
            root=x2;
            break;
        }
        x0=x1;
        x1=x2;
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
int main(){
double root;
root = BisectionMethod();
printf("Using Bisection Method the root is: %lf \n\n", root);
root = FalsePosition();
printf("Using False Position Method the root is: %lf \n\n", root);
root = NewtonRaphson();
printf("Using Newton-Raphson Method the root is: %lf \n\n", root);
root = FixedPoint();
printf("Using Fixed Point Method the root is: %lf \n\n", root);
root = Secant();
printf("Using Secant Method the root is: %lf \n\n", root);
return 0;
}
Chapter 26: Fast Fourier Transform
Introduction
The Real and Complex forms of the DFT (Discrete Fourier Transform) can be used to perform
frequency analysis or synthesis for any discrete and periodic signal. The FFT (Fast Fourier
Transform) is an implementation of the DFT which may be performed quickly on modern CPUs.
Examples
Radix 2 FFT
The simplest and perhaps best-known method for computing the FFT is the Radix-2 Decimation in
Time algorithm. The Radix-2 FFT works by decomposing an N point time domain signal into N
time domain signals each composed of a single point.
Signal decomposition, or ‘decimation in time’ is achieved by bit reversing the indices for the array
of time domain data. Thus, for a sixteen-point signal, sample 1 (Binary 0001) is swapped with
sample 8 (1000), sample 2 (0010) is swapped with 4 (0100) and so on. Sample swapping using
the bit reverse technique can be achieved simply in software, but limits the use of the Radix 2 FFT
to signals of length N = 2^M.
The value of a 1-point signal in the time domain is equal to its value in the frequency domain, thus
this array of decomposed single time-domain points requires no transformation to become an
array of frequency domain points. The N single points; however, need to be reconstructed into one
N-point frequency spectra. Optimal reconstruction of the complete frequency spectrum is
performed using butterfly calculations. Each reconstruction stage in the Radix-2 FFT performs a
number of two point butterflies, using a similar set of exponential weighting functions, Wn^R.
The FFT removes redundant calculations in the Discrete Fourier Transform by exploiting the
periodicity of Wn^R. Spectral reconstruction is completed in log2(N) stages of butterfly calculations
giving X[K]; the real and imaginary frequency domain data in rectangular form. To convert to
magnitude and phase (polar coordinates) requires finding the absolute value, √(Re² + Im²), and
argument, tan⁻¹(Im/Re).
The complete butterfly flow diagram for an eight point Radix 2 FFT is shown below. Note the input
signals have previously been reordered according to the decimation in time procedure outlined
previously.
1. Conjugate the frequency domain data by changing the sign of the imaginary component for
each instance of K.
2. Perform the forward FFT on the conjugated frequency domain data.
3. Divide each output of the result of this FFT by N to give the true time domain value.
4. Find the complex conjugate of the output by inverting the imaginary component of the time
domain data for all instances of n.
Note: both frequency and time domain data are complex variables. Typically the imaginary
component of the time domain signal following an inverse FFT is either zero, or ignored as
rounding error. Increasing the precision of variables from 32-bit float to 64-bit double, or 128-bit
long double significantly reduces rounding errors produced by several consecutive FFT
operations.
#include <math.h>

void inverseFFT(complex* pX, int N) //illustrative signature: pX holds the N frequency domain points
{
    float NN = 1.0f / (float)N; //scaling factor (assumed; this declaration was lost from the listing)
    int i;
    complex* x;
    for ( i = 0, x = pX; i < N; i++, x++){
        x->Re *= NN; // Divide time domain by N for correct amplitude scaling
        x->Im *= -1; // Change the sign of ImX
    }
}
Chapter 27: Floyd-Warshall Algorithm
Examples
All Pair Shortest Path Algorithm
Floyd-Warshall's algorithm is for finding shortest paths in a weighted graph with positive or
negative edge weights. A single execution of the algorithm will find the lengths (summed weights)
of the shortest paths between all pair of vertices. With a little variation, it can print the shortest path
and can detect negative cycles in a graph. Floyd-Warshall is a Dynamic-Programming algorithm.
Let's look at an example. We're going to apply Floyd-Warshall's algorithm on this graph:
First thing we do is, we take two 2D matrices. These are adjacency matrices. The size of the
matrices is going to be the total number of vertices. For our graph, we will take 4 * 4 matrices. The
Distance Matrix is going to store the minimum distance found so far between two vertices. At first,
for the edges, if there is an edge between u-v and the distance/weight is w, we'll store:
distance[u][v] = w. For all the edges that don't exist, we'll put infinity. The Path Matrix is
for regenerating the minimum distance path between two vertices. So initially, if there is a path
between u and v, we're going to put path[u][v] = u. This means the best way to come to vertex-v
from vertex-u is to use the edge that connects v with u. If there is no path between two vertices,
we're going to put N there, indicating there is no path available now. The two tables for our graph
will look like:
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| | 1 | 2 | 3 | 4 | | | 1 | 2 | 3 | 4 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 1 | 0 | 3 | 6 | 15 | | 1 | N | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 2 | inf | 0 | -2 | inf | | 2 | N | N | 2 | N |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 3 | inf | inf | 0 | 2 | | 3 | N | N | N | 3 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 4 | 1 | inf | inf | 0 | | 4 | 4 | N | N | N |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
distance path
Since there are no self-loops in the graph, the diagonal of the path matrix is set to N, and the distance from a vertex to itself is 0.
To apply Floyd-Warshall algorithm, we're going to select a middle vertex k. Then for each vertex i,
we're going to check if we can go from i to k and then k to j, where j is another vertex and
minimize the cost of going from i to j. If the current distance[i][j] is greater than distance[i][k] +
distance[k][j], we're going to put distance[i][j] equals to the summation of those two distances.
And the path[i][j] will be set to path[k][j], as it is better to go from i to k, and then k to j. All the
vertices will be selected as k. We'll have 3 nested loops: for k going from 1 to 4, i going from 1 to 4
and j going from 1 to 4. We're going to check:

if distance[i][k] + distance[k][j] < distance[i][j]
    distance[i][j] := distance[i][k] + distance[k][j]
    path[i][j] := path[k][j]
So what we're basically checking is, for every pair of vertices, do we get a shorter distance by
going through another vertex? The total number of operations for our graph will be 4 * 4 * 4 = 64.
That means we're going to do this check 64 times. Let's look at a few of them:
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| | 1 | 2 | 3 | 4 | | | 1 | 2 | 3 | 4 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 1 | 0 | 3 | 1 | 3 | | 1 | N | 1 | 2 | 3 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 2 | 1 | 0 | -2 | 0 | | 2 | 4 | N | 2 | 3 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 3 | 3 | 6 | 0 | 2 | | 3 | 4 | 1 | N | 3 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 4 | 1 | 4 | 2 | 0 | | 4 | 4 | 1 | 2 | N |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
distance path
This is our shortest distance matrix. For example, the shortest distance from 1 to 4 is 3 and the
shortest distance from 4 to 3 is 2. Our pseudo-code will be:
Procedure Floyd-Warshall(Graph):
for k from 1 to V // V denotes the number of vertex
for i from 1 to V
for j from 1 to V
if distance[i][j] > distance[i][k] + distance[k][j]
distance[i][j] := distance[i][k] + distance[k][j]
path[i][j] := path[k][j]
end if
end for
end for
end for
To print the path, we'll check the Path matrix. To print the path from u to v, we'll start from
path[u][v]. We'll keep changing v = path[u][v] until we find path[u][v] = u, and push every
value of path[u][v] onto a stack. After finding u, we'll print u and start popping items from the stack
and printing them. This works because the path matrix stores the value of the vertex which shares
the shortest path to v from any other node. The pseudo-code will be:
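Procedure PrintPath(u, v):
if path[u][v] = N
    Print "No path exists"
    return
Create an empty stack S
while v is not equal to u
    S.push(v)
    v := path[u][v]
end while
Print u
while S is not empty
    Print S.pop()
end while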
To find out if there is a negative edge cycle, we'll need to check the main diagonal of distance
matrix. If any value on the diagonal is negative, that means there is a negative cycle in the graph.
Complexity:
The complexity of Floyd-Warshall algorithm is O(V³) and the space complexity is: O(V²).
Chapter 28: Graph
Introduction
A graph is a collection of points and lines connecting some (possibly empty) subset of them. The
points of a graph are called graph vertices, "nodes" or simply "points." Similarly, the lines
connecting the vertices of a graph are called graph edges, "arcs" or "lines."
A graph G can be defined as a pair (V,E), where V is a set of vertices, and E is a set of edges
between the vertices E ⊆ {(u,v) | u, v ∈ V}.
Remarks
Graphs are a mathematical structure that models sets of objects that may or may not be connected
with members from sets of edges or links. A graph consists of:
• A set of vertices.
• A set of edges that connect pairs of vertices.
Examples
Topological Sort
A topological ordering, or a topological sort, orders the vertices in a directed acyclic graph on a
line, i.e. in a list, such that all directed edges go from left to right. Such an ordering cannot exist if
the graph contains a directed cycle because there is no way that you can keep going right on a
line and still return back to where you started from.
Formally, in a graph G = (V, E), a linear ordering of all its vertices is such that if G contains an
edge (u, v) ∈ E from vertex u to vertex v, then u precedes v in the ordering.
It is important to note that each DAG has at least one topological sort.
There are known algorithms for constructing a topological ordering of any DAG in linear time, one
example is:
1. call depth-first search on the graph to compute the finishing time of each vertex;
2. as each vertex is finished, insert it onto the front of a linked list;
3. return the linked list of vertices, as it is now sorted.
A topological sort can be performed in Θ(V + E) time, since the depth-first search algorithm takes
Θ(V + E) time and it takes O(1) (constant time) to insert each of |V| vertices into the front of a linked
list.
Many applications use directed acyclic graphs to indicate precedences among events. We use
topological sorting so that we get an ordering to process each vertex before any of its successors.
Vertices in a graph may represent tasks to be performed and the edges may represent constraints
that one task must be performed before another; a topological ordering is a valid sequence in which
to perform the set of tasks described in V.
Let our graph be called dag (since it is a directed acyclic graph), and let it contain 5 vertices:
A <- dag.add_vertex(Task(4));
B <- dag.add_vertex(Task(5));
C <- dag.add_vertex(Task(3));
D <- dag.add_vertex(Task(2));
E <- dag.add_vertex(Task(7));
where we connect the vertices with directed edges such that the graph is acyclic,
// A ---> C ----+
// | | |
// v v v
// B ---> D --> E
dag.add_edge(A, B, Cooldown(2));
dag.add_edge(A, C, Cooldown(2));
dag.add_edge(B, D, Cooldown(1));
dag.add_edge(C, D, Cooldown(1));
dag.add_edge(C, E, Cooldown(1));
dag.add_edge(D, E, Cooldown(3));

For this dag, one valid topological ordering is A, B, C, D, E; another is A, C, B, D, E (in both, every directed edge goes from left to right).
Thorup's algorithm
Thorup's algorithm for single source shortest paths on undirected graphs has time complexity
O(m), lower than Dijkstra.
Basic ideas are the following. (Sorry, I didn't try implementing it yet, so I might miss some minor
details. And the original paper is paywalled so I tried to reconstruct it from other sources
referencing it. Please remove this comment if you could verify.)
• There are ways to find the spanning tree in O(m) (not described here). You need to "grow"
the spanning tree from the shortest edge to the longest, and it would be a forest with several
connected components before fully grown.
• Select an integer b (b>=2) and only consider the spanning forests with length limit b^k.
Merge the components which are exactly the same but with different k, and call the minimum
k the level of the component. Then logically make components into a tree. u is the parent of
v iff u is the smallest component distinct from v that fully contains v. The root is the whole
graph and the leaves are single vertices in the original graph (with the level of negative
infinity). The tree still has only O(n) nodes.
• Maintain the distance of each component to the source (like in Dijkstra's algorithm). The
distance of a component with more than one vertices is the minimum distance of its
unexpanded children. Set the distance of the source vertex to 0 and update the ancestors
accordingly.
• Consider the distances in base b. When visiting a node in level k the first time, put its
children into buckets shared by all nodes of level k (as in bucket sort, replacing the heap in
Dijkstra's algorithm) by the digit k and higher of its distance. Each time visiting a node,
consider only its first b buckets, visit and remove each of them, update the distance of the
current node, and relink the current node to its own parent using the new distance and wait
for the next visit for the following buckets.
• When a leaf is visited, the current distance is the final distance of the vertex. Expand all
edges from it in the original graph and update the distances accordingly.
• Visit the root node (whole graph) repeatedly until the destination is reached.
It is based on the fact that, there isn't an edge with length less than l between two connected
components of the spanning forest with length limitation l, so, starting at distance x, you could
focus only on one connected component until you reach the distance x + l. You'll visit some
vertices before vertices with shorter distance are all visited, but that doesn't matter because it is
known there won't be a shorter path to here from those vertices. Other parts work like the bucket
sort / MSD radix sort, and of course, it requires the O(m) spanning tree.
A cycle in a directed graph exists if there's a back edge discovered during a DFS. A back edge is
an edge from a node to itself or one of the ancestors in a DFS tree. For a disconnected graph, we
get a DFS forest, so you have to iterate through all vertices in the graph to find disjoint DFS trees.
C++ implementation:
#include <iostream>
#include <list>
using namespace std;

#define NUM_V   4

//helper: performs DFS from vertex u and looks for a back edge
bool helper(list<int>* graph, int u, bool* visited, bool* recStack)
{
    visited[u]=true; //mark the current node as visited
    recStack[u]=true; //and put it on the current recursion stack
    list<int>::iterator i;
    for(i = graph[u].begin(); i != graph[u].end(); ++i)
    {
        if(recStack[*i]) //back edge found: *i is an ancestor of u in the DFS tree
            return true;
        if(!visited[*i] && helper(graph, *i, visited, recStack))
            return true;
    }
    recStack[u]=false; //remove u from the recursion stack
    return false;
}

bool isCyclic(list<int>* graph, int V)
{
    bool visited[NUM_V], recStack[NUM_V];
    for(int i = 0;i<V;i++)
        visited[i]=false, recStack[i]=false; //initialize all vertices as not visited and not recursed

    for(int u = 0; u < V; u++) //Iteratively checks if every vertex has been visited
    {   if(visited[u]==false)
        {   if(helper(graph, u, visited, recStack)) //checks if the DFS tree from the vertex contains a cycle
                return true;
        }
    }
    return false;
}
/*
Driver function
*/
int main()
{
list<int>* graph = new list<int>[NUM_V];
graph[0].push_back(1);
graph[0].push_back(2);
graph[1].push_back(2);
graph[2].push_back(0);
graph[2].push_back(3);
graph[3].push_back(3);
bool res = isCyclic(graph, NUM_V);
cout<<res<<endl;
}
Result: As shown below, the graph contains cycles: one between vertices 0 and 2 (0 → 2 → 0), one
through vertices 0, 1 and 2 (0 → 1 → 2 → 0), and a self-loop at vertex 3. The time complexity of the
search is O(V+E), where V is the number of vertices and E is the number of edges.
Graph Theory is the study of graphs, which are mathematical structures used to model pairwise
relations between objects.
Did you know that almost all the problems of planet Earth can be converted into problems of roads
and cities, and solved? Graph Theory was invented many years ago, even before the invention of the
computer. Leonhard Euler wrote a paper on the Seven Bridges of Königsberg which is regarded
as the first paper in Graph Theory. Since then, people have come to realize that if we can convert
any problem to this City-Road problem, we can solve it easily by Graph Theory.

Graph Theory has many applications. One of the most common applications is to find the shortest
distance from one city to another. We all know that to reach your PC, this web-page had to
travel through many routers from the server. Graph Theory helps to find out the routers that need to be
crossed. During war, which street needs to be bombarded to disconnect the capital city from the
others can also be found out using Graph Theory.
Graph:
Let's say, we have 6 cities. We mark them as 1, 2, 3, 4, 5, 6. Now we connect the cities that have
roads between each other.
This is a simple graph where some cities are shown with the roads that are connecting them. In
Graph Theory, we call each of these cities Node or Vertex and the roads are called Edge. Graph
is simply a connection of these nodes and edges.
A node can represent a lot of things. In some graphs, nodes represent cities, some represent
airports, some represent a square in a chessboard. Edge represents the relation between each
nodes. That relation can be the time to go from one airport to another, the moves of a knight from
one square to all the other squares etc.
Path of Knight in a Chessboard
In simple words, a Node represents any object and Edge represents the relation between two
objects.
Adjacent Node:
If a node A shares an edge with node B, then B is considered to be adjacent to A. In other words,
if two nodes are directly connected, they are called adjacent nodes. One node can have multiple
adjacent nodes.
In directed graphs, the edges have direction signs on one side, which means the edges are
Unidirectional. On the other hand, the edges of undirected graphs can be traversed in both
directions, which means they are Bidirectional. Usually undirected graphs are represented with no
signs on either side of the edges.
Let's assume there is a party going on. The people in the party are represented by nodes and
there is an edge between two people if they shake hands. Then this graph is undirected because
any person A shakes hands with person B if and only if B also shakes hands with A. In contrast, if
the edges from a person A to another person B correspond to A's admiring B, then this graph is
directed, because admiration is not necessarily reciprocated. The former type of graph is called an
undirected graph and the edges are called undirected edges while the latter type of graph is called
a directed graph and the edges are called directed edges.
A weighted graph is a graph in which a number (the weight) is assigned to each edge. Such
weights might represent for example costs, lengths or capacities, depending on the problem at
hand.
An unweighted graph is simply the opposite. We assume that the weight of all the edges is the
same (presumably 1).
Path:
A path represents a way of going from one node to another. It consists of sequence of edges.
There can be multiple paths between two nodes.
In the example above, there are two paths from A to D. A->B, B->C, C->D is one path. The cost of
this path is 3 + 4 + 2 = 9. Again, there's another path A->D. The cost of this path is 10. The path
that costs the lowest is called shortest path.
Degree:
The degree of a vertex is the number of edges that are connected to it. Any edge that
connects to the vertex at both ends (a loop) is counted twice.
Some Algorithms Related to Graph Theory
• Bellman–Ford algorithm
• Dijkstra's algorithm
• Ford–Fulkerson algorithm
• Kruskal's algorithm
• Nearest neighbour algorithm
• Prim's algorithm
• Depth-first search
• Breadth-first search
• Adjacency Matrix
• Adjacency List
An adjacency matrix is a square matrix used to represent a finite graph. The elements of the
matrix indicate whether pairs of vertices are adjacent or not in the graph.
Adjacent means 'next to or adjoining something else' or to be beside something. For example,
your neighbors are adjacent to you. In graph theory, if we can go to node B from node A, we can
say that node B is adjacent to node A. Now we will learn about how to store which nodes are
adjacent to which one via Adjacency Matrix. This means, we will represent which nodes share
edge between them. Here matrix means 2D array.
Here you can see a table beside the graph, this is our adjacency matrix. Here Matrix[i][j] = 1
represents there is an edge between i and j. If there's no edge, we simply put Matrix[i][j] = 0.
These edges can be weighted, like it can represent the distance between two cities. Then we'll put
the value in Matrix[i][j] instead of putting 1.
The graph described above is Bidirectional or Undirected; that means, if we can go to node 1 from
node 2, we can also go to node 2 from node 1. If the graph were Directed, there would have
been arrow signs on one side of the edges. Even then, we could represent it using an adjacency
matrix.

We represent the nodes that don't share an edge by infinity. One thing to be noticed is that if the
graph is undirected, the matrix becomes symmetric.
for i from 1 to N
for j from 1 to N
Take input -> Matrix[i][j]
endfor
endfor
Memory is a huge problem. No matter how many edges there are, we will always need an N * N sized
matrix where N is the number of nodes. If there are 10000 nodes, the matrix size will be 4 * 10000
* 10000, around 381 megabytes. This is a huge waste of memory if we consider graphs that have a
few edges.
Suppose we want to find out to which node we can go from a node u. We'll need to check the
whole row of u, which costs a lot of time.
The only benefit is that, we can easily find the connection between u-v nodes, and their cost using
Adjacency Matrix.
import java.util.Scanner;

public class Represent_Graph_Adjacency_Matrix
{
    private final int vertices;
    private int[][] adjacency_matrix;

    public Represent_Graph_Adjacency_Matrix(int v)
    {
        vertices = v;
        adjacency_matrix = new int[vertices + 1][vertices + 1];
    }

    public void makeEdge(int to, int from, int edge)
    {
        adjacency_matrix[to][from] = edge; //mark an edge from "to" to "from"
    }

    public int getEdge(int to, int from)
    {
        return adjacency_matrix[to][from];
    }

    public static void main(String args[])
    {
        int v, e, count = 1, to, from;
        Scanner sc = new Scanner(System.in);
        Represent_Graph_Adjacency_Matrix graph;
        try
        {
            System.out.println("Enter the number of vertices: ");
            v = sc.nextInt();
            System.out.println("Enter the number of edges: ");
            e = sc.nextInt();

            graph = new Represent_Graph_Adjacency_Matrix(v);

            System.out.println("Enter the edges: <to> <from>");
            while (count <= e)
            {
                to = sc.nextInt();
                from = sc.nextInt();
                graph.makeEdge(to, from, 1);
                count++;
            }

            System.out.println("The adjacency matrix for the given graph is: ");
            System.out.print("  ");
            for (int i = 1; i <= v; i++)
                System.out.print(i + " ");
            System.out.println();
            for (int i = 1; i <= v; i++)
            {
                System.out.print(i + " ");
                for (int j = 1; j <= v; j++)
                    System.out.print(graph.getEdge(i, j) + " ");
                System.out.println();
            }
        }
        catch (Exception E)
        {
            System.out.println("Something went wrong");
        }
        sc.close();
    }
}
Running the code: Save the file and compile using javac Represent_Graph_Adjacency_Matrix.java
Example:
$ java Represent_Graph_Adjacency_Matrix
Enter the number of vertices:
4
Enter the number of edges:
6
Enter the edges: <to> <from>
1 1
3 4
2 3
1 4
2 4
1 2
The adjacency matrix for the given graph is:
1 2 3 4
1 1 1 0 1
2 0 0 1 1
3 0 0 0 1
4 0 0 0 0
An adjacency list is a collection of unordered lists used to represent a finite graph. Each list describes
the set of neighbors of a vertex in the graph. It takes less memory to store graphs.
This is called an adjacency list. It shows which nodes are connected to which nodes. We can store
this information using a 2D array, but it would cost us the same memory as an adjacency matrix.
Instead we are going to use dynamically allocated memory to store this one.
Many languages support Vector or List which we can use to store adjacency list. For these, we
don't need to specify the size of the List. We only need to specify the maximum number of nodes.
Since this one is an undirected graph, if there is an edge from x to y, there is also an edge from y
to x. If it was a directed graph, we'd omit the second one. For weighted graphs, we need to store
the cost too. We'll create another vector or list named cost[] to store these. The pseudo-code:
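for i from 1 to E    // E is the number of edges
    Take input -> x, y    // an edge between x and y
    edge[x].push(y)
    edge[y].push(x)    // omit this line if the graph is directed
endfor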
From this one, we can easily find out the total number of nodes connected to any node, and what
these nodes are. It takes less time than Adjacency Matrix. But if we needed to find out if there's an
edge between u and v, it'd have been easier if we kept an adjacency matrix.
Chapter 29: Graph Traversals
Examples
Depth First Search traversal function
The function takes as arguments the current node index, the adjacency list (stored in a vector of
vectors in this example), and a vector of booleans to keep track of which nodes have been visited.
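An equivalent sketch of such a function in Java (using a list of lists as the adjacency structure; names are illustrative):

import java.util.List;

public class GraphTraversal {
    // Visits every node reachable from "current", printing nodes in DFS order.
    static void dfs(int current, List<List<Integer>> adjacency, boolean[] visited) {
        visited[current] = true;
        System.out.println("Visited " + current);
        for (int next : adjacency.get(current)) {
            if (!visited[next]) {
                dfs(next, adjacency, visited);
            }
        }
    }
}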
Chapter 30: Greedy Algorithms
Remarks
A greedy algorithm is an algorithm in which, at each step, we choose the most beneficial option
without looking into the future. The choice depends only on the current profit.

The greedy approach is usually a good approach when each profit can be picked up in every step, so
no choice blocks another one.
Examples
Continuous knapsack problem
Given items as (value, weight) pairs, we need to place them in a knapsack (container) of capacity k.
Note! We can break items to maximize value!
Example input:
Expected output:
maximumValueOfItemsInK = 20;
Algorithm:
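Sort the items by value/weight ratio in descending order, repeatedly take as much as fits of the best remaining item, and break the last item if necessary. A Java sketch of this greedy strategy (class name and the item numbers in main are illustrative, not the example above):

import java.util.Arrays;
import java.util.Comparator;

public class ContinuousKnapsack {
    // items[i] = {value, weight}; returns the maximum value that fits into capacity k.
    static double maximumValue(double[][] items, double k) {
        // best value per unit of weight first
        Arrays.sort(items, Comparator.comparingDouble((double[] it) -> it[0] / it[1]).reversed());
        double total = 0;
        for (double[] it : items) {
            if (k <= 0) break;
            double take = Math.min(it[1], k);    // take the whole item, or break it
            total += it[0] * (take / it[1]);
            k -= take;
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] items = {{60, 10}, {100, 20}, {120, 30}};
        System.out.println(maximumValue(items, 50));    // prints 240.0
    }
}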
Huffman Coding
Huffman code is a particular type of optimal prefix code that is commonly used for lossless data
compression. It compresses data very effectively saving from 20% to 90% memory, depending on
the characteristics of the data being compressed. We consider the data to be a sequence of
characters. Huffman's greedy algorithm uses a table giving how often each character occurs (i.e.,
its frequency) to build up an optimal way of representing each character as a binary string.
Huffman code was proposed by David A. Huffman in 1951.
Suppose we have a 100,000-character data file that we wish to store compactly. We assume that
there are only 6 different characters in that file. The frequency of the characters are given by:
+------------------------+-----+-----+-----+-----+-----+-----+
| Character | a | b | c | d | e | f |
+------------------------+-----+-----+-----+-----+-----+-----+
|Frequency (in thousands)| 45 | 13 | 12 | 16 | 9 | 5 |
+------------------------+-----+-----+-----+-----+-----+-----+
We have many options for how to represent such a file of information. Here, we consider the
problem of designing a Binary Character Code in which each character is represented by a unique
binary string, which we call a codeword.
+------------------------+-----+-----+-----+-----+-----+-----+
| Character | a | b | c | d | e | f |
+------------------------+-----+-----+-----+-----+-----+-----+
| Fixed-length Codeword | 000 | 001 | 010 | 011 | 100 | 101 |
+------------------------+-----+-----+-----+-----+-----+-----+
|Variable-length Codeword| 0 | 101 | 100 | 111 | 1101| 1100|
+------------------------+-----+-----+-----+-----+-----+-----+
If we use a fixed-length code, we need three bits to represent 6 characters. This method requires
300,000 bits to code the entire file. Now the question is, can we do better?
A variable-length code can do considerably better than a fixed-length code, by giving frequent
characters short codewords and infrequent characters long codewords. This code requires: (45 X
1 + 13 X 3 + 12 X 3 + 16 X 3 + 9 X 4 + 5 X 4) X 1000 = 224000 bits to represent the file, which
saves approximately 25% of memory.
One thing to remember, we consider here only codes in which no codeword is also a prefix of
some other codeword. These are called prefix codes. For variable-length coding, we code the 3-
character file abc as 0.101.100 = 0101100, where "." denotes the concatenation.
Prefix codes are desirable because they simplify decoding. Since no codeword is a prefix of any
other, the codeword that begins an encoded file is unambiguous. We can simply identify the initial
codeword, translate it back to the original character, and repeat the decoding process on the
remainder of the encoded file. For example, 001011101 parses uniquely as 0.0.101.1101, which
decodes to aabe. In short, all the combinations of binary representations are unique. Say for
example, if one letter is denoted by 110, no other letter will be denoted by 1101 or 1100. This is
because you might face confusion on whether to select 110 or to continue on concatenating the
next bit and select that one.
Compression Technique:
The technique works by creating a binary tree of nodes. These can be stored in a regular array, the
size of which depends on the number of symbols, n. A node can either be a leaf node or an
internal node. Initially all nodes are leaf nodes, which contain the symbol itself, its frequency and
optionally, a link to its child nodes. As a convention, bit '0' represents left child and bit '1'
represents right child. Priority queue is used to store the nodes, which provides the node with
lowest frequency when popped. The process is described below:
1. Create a leaf node for each symbol and add it to the priority queue.
2. While there is more than one node in the queue:
1. Remove the two nodes of highest priority from the queue.
2. Create a new internal node with these two nodes as children and with frequency equal
to the sum of the two nodes' frequency.
3. Add the new node to the queue.
3. The remaining node is the root node and the Huffman tree is complete.
The pseudo-code looks like:
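Procedure Huffman(C):     // C is the set of n characters, each with a frequency
n := |C|
Q := C                    // min-priority queue, keyed on frequency
for i from 1 to n - 1
    allocate a new node z
    z.left := x := Extract-Min(Q)
    z.right := y := Extract-Min(Q)
    z.frequency := x.frequency + y.frequency
    Insert(Q, z)
end for
Return Extract-Min(Q)     // the root of the completed Huffman tree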
Although linear-time given sorted input, in general cases of arbitrary input, using this algorithm
requires pre-sorting. Thus, since sorting takes O(nlogn) time in general cases, both methods have
the same complexity.
Since n here is the number of symbols in the alphabet, which is typically very small number
(compared to the length of the message to be encoded), time complexity is not very important in
the choice of this algorithm.
Decompression Technique:
The process of decompression is simply a matter of translating the stream of prefix codes to
individual byte value, usually by traversing the Huffman tree node by node as each bit is read from
the input stream. Reaching a leaf node necessarily terminates the search for that particular byte
value. The leaf value represents the desired character. Usually the Huffman Tree is constructed
using statistically adjusted data on each compression cycle, thus the reconstruction is fairly
simple. Otherwise, the information to reconstruct the tree must be sent separately. The pseudo-
code:
Greedy Explanation:
Huffman coding looks at the occurrence of each character and stores it as a binary string in an
optimal way. The idea is to assign variable-length codes to input characters; the lengths of the
assigned codes are based on the frequencies of the corresponding characters. We create a binary
tree and operate on it in bottom-up manner so that the two least frequent characters are as far as
possible from the root. In this way, the most frequent character gets the smallest code and the
least frequent character gets the largest code.
References:
• Introduction to Algorithms - Charles E. Leiserson, Clifford Stein, Ronald Rivest, and Thomas
H. Cormen
• Huffman Coding - Wikipedia
• Discrete Mathematics and Its Applications - Kenneth H. Rosen
Change-making problem
Given a money system, is it possible to give an amount with coins, and how can we find a minimal
set of coins corresponding to this amount?
Canonical money systems. For some money systems, like the ones we use in real life, the
"intuitive" solution works perfectly. For example, if the different euro coins and bills (excluding
cents) are 1€, 2€, 5€, 10€, giving the highest coin or bill until we reach the amount and repeating
this procedure will lead to the minimal set of coins.

These systems are made so that change-making is easy. The problem gets harder when it comes
to an arbitrary money system.
General case. How to give 99€ with coins of 10€, 7€ and 5€? Here, giving coins of 10€ until we
are left with 9€ leads obviously to no solution. Worse than that, a solution may not exist. This
problem is in fact NP-hard, but acceptable solutions mixing greediness and memoization exist.
The idea is to explore all the possibilities and pick the one with the minimal number of coins.

To give an amount X > 0, we choose a piece P in the money system, and then solve the sub-
problem corresponding to X-P. We try this for all the pieces of the system. The solution, if it exists,
is then the smallest path that led to 0.
Here an OCaml recursive function corresponding to this method. It returns None, if no solution
exists.
(* option utilities *)
let optmin x y =
  match x,y with
  | None,a | a,None -> a
  | Some x, Some y-> Some (min x y)

let optsucc = function (* increments the coin count inside the option *)
  | None -> None
  | Some n -> Some (n+1)
(* Change-making problem*)
let change_make money_system amount =
let rec loop n =
let onepiece acc piece =
match n - piece with
| 0 -> (*problem solved with one coin*)
Some 1
| x -> if x < 0 then
(*we don't reach 0, we discard this solution*)
None
else
(*we search the smallest path different to None with the remaining pieces*)
optmin (optsucc (loop x)) acc
in
(*we call onepiece forall the pieces*)
List.fold_left onepiece None money_system
in loop amount
Note: We can remark that this procedure may compute the change set for the same value several
times. In practice, using memoization to avoid these repetitions leads to much faster results.
The Problem
You have a set of things to do (activities). Each activity has a start time and an end time. You aren't
allowed to perform more than one activity at a time. Your task is to find a way to perform the
maximum number of activities.
Remember, you can't take two classes at the same time. That means you can't take class 1 and 2
because they share a common time 10.30 A.M to 11.00 A.M. However, you can take class 1 and 3
because they don't share a common time. So your task is to take maximum number of classes as
possible without any overlap. How can you do that?
Analysis
Let's think about a solution using the greedy approach. First of all we randomly choose some
approach and check whether it will work or not.

• Sort the activities by start time, that means whichever activity starts first, we will take it first.
Then take activities one by one from the sorted list and check whether the current activity
intersects with any previously taken activity. If the current activity does not intersect with the
previously taken activity, we will perform the activity, otherwise we will not. This approach will
work for some cases like
Activity No. start time end time
the sorting order will be 4-->1-->2-->3. The activities 4--> 1--> 3 will be performed and activity 2
will be skipped; a maximum of 3 activities will be performed. It works for this type of case, but it will
fail for some cases. Let's apply this approach to the case:

The sort order will be 4-->1-->2-->3 and only activity 4 will be performed, but the answer can be
activities 1-->3 or 2-->3 being performed. So our approach will not work for the above case. Let's
try another approach.
• Sort the activities by time duration, that means perform the shortest activity first. That can
solve the previous problem, although the problem is not completely solved. There are still some
cases that can fail the solution. Apply this approach to the case below.

If we sort the activities by time duration, the sort order will be 2--> 3 --->1. And if we perform activity
No. 2 first, then no other activity can be performed. But the answer would be to perform activity 1 and
then perform 3. So we can perform a maximum of 2 activities. So this cannot be a solution to this
problem. We should try a different approach.
The solution
• Sort the activities by ending time, that means the activity that finishes first comes first. The
algorithm is given below.
1. Sort the activities by their ending times.
2. If the activity to be performed does not share a common time with the previously
performed activities, perform the activity.

Sort the activities by their ending times, so the sort order will be 1-->5-->2-->4-->3. The answer is
1-->3: these two activities will be performed. Here is the pseudo-code:
1. Sort: activities
2. Perform the first activity from the sorted list of activities.
3. Set: Current_activity := first activity
4. Set: end_time := end_time of Current activity
5. Go to the next activity if it exists; if not, terminate.
6. If start_time of current activity >= end_time: perform the activity and go to 4.
7. Else: go to 5.
Chapter 31: Hash Functions
Examples
Introduction to hash functions
A hash function h() is an arbitrary function which maps data x ∈ X of arbitrary size to a value y ∈ Y
of fixed size: y = h(x). Good hash functions have the following restrictions:
• hash functions are deterministic: h(x) should always return the same value for a given x
In the general case the size of the hash is less than the size of the input data: |y| < |x|. Hash
functions are not reversible, or in other words, collisions may occur: ∃ x1, x2 ∈ X, x1 ≠ x2: h(x1) =
h(x2). X may be a finite or infinite set and Y is a finite set.
Hash functions are used in many parts of computer science, for example in software engineering,
cryptography, databases, networks, machine learning and so on. There are many different types
of hash functions, with differing domain-specific properties.

Often a hash is an integer value. There are special methods in programming languages for hash
calculation. For example, in C# the GetHashCode() method for all types returns an Int32 value (32 bit
integer number). In Java every class provides a hashCode() method which returns an int. Each data
type has its own or user-defined implementation.
Hash methods
There are several approaches for determining a hash function. Without loss of generality, let x ∈ X =
{z ∈ ℤ: z ≥ 0} be non-negative integer numbers. A common choice is the division method, h(x) = x
mod m, where m is the number of slots. Often m is prime (not too close to an exact power of 2).
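For instance, the division method can be written as a one-liner (a minimal Java sketch):

// Division method: maps a non-negative integer key to one of m slots.
static int hash(int x, int m) {
    return x % m;    // m is typically prime and not too close to a power of 2
}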
Hash table
Hash functions are used in hash tables for computing an index into an array of slots. A hash table is
a data structure for implementing dictionaries (a key-value structure). Well-implemented hash tables
have O(1) expected time for the following operations: insert, search and delete data by key. More
than one key may hash to the same slot. There are two ways of resolving collisions:
1. Chaining: a linked list is used for storing elements with the same hash value in the slot
2. Open addressing: all elements are stored in the slot array itself, and a probe sequence
determines the order in which slots are examined
The following methods are used to compute the probe sequences required for open addressing:
+-------------------+-----------------------------------------+
| Method            | Formula                                 |
+-------------------+-----------------------------------------+
| Linear probing    | h(x, i) = (h'(x) + i) mod m             |
| Quadratic probing | h(x, i) = (h'(x) + c1*i + c2*i²) mod m  |
| Double hashing    | h(x, i) = (h1(x) + i*h2(x)) mod m       |
+-------------------+-----------------------------------------+
Where i ∈ {0, 1, ..., m-1}, h'(x), h1(x), h2(x) are auxiliary hash functions, c1, c2 are positive
auxiliary constants.
Examples
Let x be uniformly distributed over {1, ..., 1000} and h(x) = x mod m. The table below shows the
hash values when m is not prime and when it is prime. With m = 100, the keys 557 and 757 collide
(both hash to 57), while with the prime m = 101 all the hash values are distinct.
+-----+---------------------+-----------------+
|  x  | m = 100 (not prime) | m = 101 (prime) |
+-----+---------------------+-----------------+
| 723 | 23                  | 16              |
| 103 | 3                   | 2               |
| 738 | 38                  | 31              |
| 292 | 92                  | 90              |
| 61  | 61                  | 61              |
| 87  | 87                  | 87              |
| 995 | 95                  | 86              |
| 549 | 49                  | 44              |
| 991 | 91                  | 82              |
| 757 | 57                  | 50              |
| 920 | 20                  | 11              |
| 626 | 26                  | 20              |
| 557 | 57                  | 52              |
| 831 | 31                  | 23              |
| 619 | 19                  | 13              |
+-----+---------------------+-----------------+
Links
• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to
Algorithms.
The hash codes produced by GetHashCode() method for built-in and common C# types from the
System namespace are shown below.
Boolean
SByte
Char
Int16
Int64, Double
Xor between lower and upper 32 bits of 64 bit number
Decimal
Object
RuntimeHelpers.GetHashCode(this);
String
Hash code computation depends on the platform type (Win32 or Win64), feature of using
randomized string hashing, Debug / Release mode. In case of Win64 platform:
ValueType
The first non-static field is looked for and its hashcode is taken. If the type has no non-static fields,
the hashcode of the type is returned. The hashcode of a static member can't be taken because if that
member is of the same type as the original type, the calculation would end up in an infinite loop.
Nullable<T>
Array
int ret = 0;
for (int i = (Length >= 8 ? Length - 8 : 0); i < Length; i++)
{
ret = ((ret << 5) + ret) ^ comparer.GetHashCode(GetValue(i));
}
References
• GitHub .Net Core CLR
Chapter 32: Heap Sort
Examples
Heap Sort Basic Information
Heap sort is a comparison-based sorting technique based on the binary heap data structure. It is
similar to selection sort in that we first find the maximum element and put it at the end of the data
structure. Then we repeat the same process for the remaining items.
C# Implementation
private static void Heapify(int[] input, int n, int i)
{
    var largest = i;
    var left = 2 * i + 1;     //left child
    var right = 2 * i + 2;    //right child

    if (left < n && input[left] > input[largest])
        largest = left;
    if (right < n && input[right] > input[largest])
        largest = right;

    if (largest != i)
    {
        var temp = input[i];
        input[i] = input[largest];
        input[largest] = temp;
        Heapify(input, n, largest);
    }
}
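A minimal driver consistent with the Heapify above (a sketch; the method name HeapSort is assumed):

public static void HeapSort(int[] input)
{
    var n = input.Length;

    // build a max heap from the unsorted input
    for (var i = n / 2 - 1; i >= 0; i--)
        Heapify(input, n, i);

    // repeatedly move the current maximum to the end and shrink the heap
    for (var i = n - 1; i >= 1; i--)
    {
        var temp = input[0];
        input[0] = input[i];
        input[i] = temp;
        Heapify(input, i, 0);
    }
}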
Chapter 33: Insertion Sort
Remarks
When we analyze the performance of the sorting algorithm, we are interested primarily in the
number of comparisons and exchanges.
Average Exchanges
Let En be the total average number of exchanges to sort an array of N elements. E1 = 0 (we do not
need any exchange for an array with one element). The average number of exchanges to sort an N
element array is the sum of the average number of exchanges to sort an N-1 element array and
the average number of exchanges to insert the last element into an N-1 element array.
Average Comparisons
Let Cn be the total average number of comparisons to sort an array of N elements. C1 = 0 (we do
not need any comparison on a one element array). The average number of comparisons to sort an N
element array is the sum of the average number of comparisons to sort an N-1 element array and
the average number of comparisons to insert the last element into an N-1 element array. If the last
element is the largest element, we need only one comparison; if the last element is the second
smallest element, we need N-1 comparisons. However, if the last element is the smallest element,
we do not need N comparisons. We still only need N-1 comparisons. That is why we remove 1/N in
the equation below.
Expand the term
Examples
Algorithm Basics
Insertion sort is a very simple, stable, in-place sorting algorithm. It performs well on small
sequences but it is much less efficient on large lists. At every step, the algorithm considers the i-
th element of the given sequence, moving it to the left until it is in the correct position.
Graphical Illustration
Pseudocode
for j = 1 to length(A)
    n = A[j]
    i = j - 1
    while i > 0 and A[i] > n
        A[i + 1] = A[i]
        i = i - 1
    A[i + 1] = n
Example
[5, 2, 4, 6, 1, 3]
1. [5, 2, 4, 6, 1, 3]
2. [2, 5, 4, 6, 1, 3]
3. [2, 4, 5, 6, 1, 3]
4. [2, 4, 5, 6, 1, 3]
5. [1, 2, 4, 5, 6, 3]
6. [1, 2, 3, 4, 5, 6]
C# Implementation
Haskell Implementation
Chapter 34: Integer Partition Algorithm
Examples
Basic Information of Integer Partition Algorithm
The partition of an integer is a way of writing it as a sum of positive integers. For example, the
partitions of the number 5 are:
• 5
• 4+1
• 3+2
• 3+1+1
• 2+2+1
• 2+1+1+1
• 1+1+1+1+1
Notice that changing the order of the summands will not create a different partition.
The partition function is inherently recursive in nature since the results of smaller numbers appear
as components in the result of a larger number. Let p(n,m) be the number of partitions of n using
only positive integers that are less than or equal to m. It may be seen that p(n) = p(n,n), and also
p(n,m) = p(n,n) = p(n) for m > n. The values satisfy the recurrence p(n,m) = p(n,m-1) + p(n-m,m):
a partition of n either uses no part equal to m, or uses at least one part equal to m (leaving n-m to
partition).
Auxiliary Space: O(n^2)
Time Complexity: O(n^2)
// (method name and signature assumed; the surrounding lines of the original listing were lost)
public static long GetIntegerPartitionCount(int targetNumber, int largestNumber)
{
    // Result[i, j] = number of partitions of i using parts no larger than j
    var Result = new long[targetNumber + 1, largestNumber + 1];

    for (int j = 0; j <= largestNumber; j++)
        Result[0, j] = 1;    // the empty sum is the one partition of 0

    for (int i = 1; i <= targetNumber; i++)
    {
        for (int j = 1; j <= largestNumber; j++)
        {
            if (i < j)
                Result[i, j] = Result[i, j - 1];    // parts larger than i can't be used
            else
                Result[i, j] = Result[i, j - 1] + Result[i - j, j];
        }
    }
    return Result[targetNumber, largestNumber];
}
Chapter 35: Knapsack Problem
Remarks
The Knapsack problem mostly arises in resources allocation mechanisms. The name "Knapsack"
was first introduced by Tobias Dantzig.
Examples
Knapsack Problem Basics
The Problem: Given a set of items where each item contains a weight and value, determine the
number of each to include in a collection so that the total weight is less than or equal to a given
limit and the total value is as large as possible.
Given:
1. Values(array v)
2. Weights(array w)
3. Number of distinct items(n)
4. Capacity(W)
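The driver snippet below calls a knapSack function whose listing is not shown here; a standard bottom-up tabulation consistent with the O(nW) complexity stated later is (a sketch):

def knapSack(W, wt, val, n):
    # K[i][w] = best value achievable with the first i items and capacity w
    K = [[0 for w in range(W + 1)] for i in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(1, W + 1):
            if wt[i - 1] <= w:
                # either skip item i-1, or take it and fill the rest optimally
                K[i][w] = max(K[i - 1][w], val[i - 1] + K[i - 1][w - wt[i - 1]])
            else:
                K[i][w] = K[i - 1][w]
    return K[n][W]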
val = [60, 100, 120]
wt = [10, 20, 30]
W = 50
n = len(val)
print(knapSack(W, wt, val, n))
$ python knapSack.py
220
Time Complexity of the above code: O(nW) where n is the number of items and W is the capacity of
knapsack.
Solution Implemented in C#
Chapter 36: Knuth Morris Pratt (KMP)
Algorithm
Introduction
The KMP is a pattern matching algorithm which searches for occurrences of a "word" W within a
main "text string" S by employing the observation that when a mismatch occurs, we have
sufficient information to determine where the next match could begin. We take advantage of this
information to avoid matching characters that we know will match anyway. The worst case
complexity for searching a pattern reduces to O(n).
Examples
KMP-Example
Algorithm
This algorithm is a two step process.First we create a auxiliary array lps[] and then use this array
for searching the pattern.
Preprocessing :
1. We pre-process the pattern and create an auxiliary array lps[] which is used to skip
characters while matching.
2. Here lps[i] indicates the longest proper prefix which is also a suffix. A proper prefix is a prefix in
which the whole string is not included. For example, prefixes of the string ABC are “ ”, “A”, “AB” and
“ABC”. Proper prefixes are “ ”, “A” and “AB”. Suffixes of the string are “ ”, “C”, “BC” and
“ABC”.
Searching
1. We keep matching characters txt[i] and pat[j] and keep incrementing i and j while pat[j] and
txt[i] keep matching.
2. When we see a mismatch, we know that the characters pat[0..j-1] match with txt[i-j...i-1]. We
also know that lps[j-1] is the count of characters of pat[0...j-1] that are both proper prefix and
suffix. From this we can conclude that we do not need to match these lps[j-1] characters with
txt[i-j...i-1], because we know that these characters will match anyway.
Implementation in Java
String pattern = "abc";
KMP obj = new KMP();
System.out.println(obj.patternExistKMP(str.toCharArray(), pattern.toCharArray()));
}
public int[] computeLPS(char[] str) {
    int[] lps = new int[str.length];
    lps[0] = 0;
    int j = 0;
    int i = 1;
    while (i < str.length) {
        if (str[j] == str[i]) {
            lps[i] = j + 1;
            j++;
            i++;
        } else {
            if (j != 0) {
                j = lps[j - 1]; //fall back to the previous longest prefix-suffix
            } else {
                lps[i] = 0;
                i++;
            }
        }
    }
    return lps;
}
public boolean patternExistKMP(char[] txt, char[] pat) {
    int[] lps = computeLPS(pat);
    int i = 0, j = 0;
    while (i < txt.length && j < pat.length) {
        if (txt[i] == pat[j]) {
            i++;
            j++;
        } else {
            if (j != 0)
                j = lps[j - 1]; //skip the characters that will match anyway
            else
                i++;
        }
    }
    if (j == pat.length)
        return true;
    return false;
}
Chapter 37: Kruskal's Algorithm
Remarks
Kruskal's Algorithm is a greedy algorithm used to find Minimum Spanning Tree (MST) of a
graph. A minimum spanning tree is a tree which connects all the vertices of the graph and has the
minimum total edge weight.
Kruskal's algorithm does so by repeatedly picking out edges with minimum weight (which are not
already in the MST) and add them to the final result if the two vertices connected by that edge are
not yet connected in the MST, otherwise it skips that edge. Union - Find data structure can be
used to check whether two vertices are already connected in the MST or not. A few properties of
MST are as follows:
Examples
Simple, more detailed implementation
In order to efficiently handle cycle detection, we consider each node as part of a tree. When
adding an edge, we check if its two component nodes are part of distinct trees. Initially, each node
makes up a one-node tree.
The above forest methodology is actually a disjoint-set data structure, which involves three main
operations:
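The three operations are makeSet, findSet and unionSet; the first two can be sketched as follows, with unionSet shown next:

subalgo makeSet(v: node):
    v.parent = v    // v starts as the root of its own one-node tree

subalgo findSet(v: node):
    if v.parent == v:
        return v    // v is the root of its tree
    return findSet(v.parent)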
subalgo unionSet(v, u: nodes):
vRoot = findSet(v)
uRoot = findSet(u)
uRoot.parent = vRoot
This naive implementation leads to O(n log n) time for managing the disjoint-set data structure,
leading to O(m*n log n) time for the entire Kruskal's algorithm.
We can do two things to improve the simple and sub-optimal disjoint-set subalgorithms:
1. Path compression heuristic: findSet does not need to ever handle a tree with height bigger
than 2. If it ends up iterating such a tree, it can link the lower nodes directly to the root,
optimizing future traversals;
2. Height-based merging heuristic: for each node, store the height of its subtree. When
merging, make the taller tree the parent of the smaller one, thus not increasing anyone's
height.
subalgo unionSet(u, v: nodes):
    vRoot = findSet(v)
    uRoot = findSet(u)

    if vRoot == uRoot:
        return

    if vRoot.height < uRoot.height:    // make the taller tree the parent
        vRoot.parent = uRoot
    else if vRoot.height > uRoot.height:
        uRoot.parent = vRoot
    else:
        uRoot.parent = vRoot
        vRoot.height = vRoot.height + 1
This leads to O(alpha(n)) time for each operation, where alpha is the inverse of the fast-growing
Ackermann function, thus it is very slow growing, and can be considered O(1) for practical
purposes.
This makes the entire Kruskal's algorithm O(m log m + m) = O(m log m), because of the initial
sorting.
Note
Path compression may reduce the height of the tree, hence comparing heights of the trees during
union operation might not be a trivial task. Hence to avoid the complexity of storing and calculating
the height of the trees the resulting parent can be picked randomly:
if vRoot == uRoot:
return
if random() % 2 == 0:
vRoot.parent = uRoot
else:
uRoot.parent = vRoot
In practice this randomised algorithm together with path compression for findSet operation will
result in comparable performance, yet much simpler to implement.
Sort the edges by value and add each one to the MST in sorted order, if it doesn't create a cycle.
subalgo kruskalMST(G: graph):
    sort G.edges by weight, ascending
    MST = empty set of edges
    for each edge (u, v) in G.edges:
        if findSet(u) != findSet(v):    // adding (u, v) creates no cycle
            add (u, v) to MST
            unionSet(u, v)
    return MST
Chapter 38: Line Algorithm
Introduction
Line drawing is accomplished by calculating intermediate positions along the line path between
two specified endpoint positions. An output device is then directed to fill in these positions between
the endpoints.
Examples
Bresenham Line Drawing Algorithm
Background Theory: Bresenham’s Line Drawing Algorithm is an efficient and accurate raster line
generating algorithm developed by Bresenham. It involves only integer calculations, so it is accurate
and fast. It can also be extended to display circles and other curves.
Algorithm for slope |m| < 1:

1. Input the two line endpoints (x1, y1) and (x2, y2).
2. Plot the point (x1, y1).
3. Calculate
   delx = |x2 – x1|
   dely = |y2 – y1|
4. Obtain the initial decision parameter as p = 2 * dely – delx.
5. For i = 0 to delx:
   If p < 0 then
       x1 = x1 + 1
       Plot(x1, y1)
       p = p + 2 * dely
   Else
       x1 = x1 + 1
       y1 = y1 + 1
       Plot(x1, y1)
       p = p + 2 * dely – 2 * delx
   End if
   End for
6. END
Source Code:
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include <graphics.h>

int main()
{
    int gdriver=DETECT,gmode;
    int x1,y1,x2,y2,delx,dely,p,i;
    printf("Enter the two endpoints (x1 y1 x2 y2): ");
    scanf("%d%d%d%d",&x1,&y1,&x2,&y2);
    initgraph(&gdriver,&gmode,"c:\\TC\\BGI");
    putpixel(x1,y1,RED);
    delx=abs(x2-x1);
    dely=abs(y2-y1);
    p=(2*dely)-delx;
for(i=0;i<delx;i++){
if(p<0)
{
x1=x1+1;
putpixel(x1,y1,RED);
p=p+(2*dely);
}
else
{
x1=x1+1;
y1=y1+1;
putpixel(x1,y1,RED);
p=p+(2*dely)-(2*delx);
}
}
getch();
closegraph();
return 0;
}
Algorithm for slope |m|>1:
1. Input the two line endpoints (x1, y1) and (x2, y2).
2. Plot the first point (x1, y1).
3. Calculate
   delx = |x2 - x1|
   dely = |y2 - y1|
   and obtain the initial decision parameter as p = 2 * delx - dely.
4. Repeat dely times:
   If p < 0 then
       y1 = y1 + 1
       Plot(x1, y1)
       p = p + 2 * delx
   Else
       x1 = x1 + 1
       y1 = y1 + 1
       Plot(x1, y1)
       p = p + 2 * delx - 2 * dely
   End if
5. End for
6. END
Source Code:

#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include <graphics.h>

int main()
{
    int gdriver=DETECT,gmode;
    int x1,y1,x2,y2,delx,dely,p,i;
    printf("Enter the two endpoints (x1 y1 x2 y2): ");
    scanf("%d%d%d%d",&x1,&y1,&x2,&y2);
    initgraph(&gdriver,&gmode,"c:\\TC\\BGI");
    putpixel(x1,y1,RED);
    delx=abs(x2-x1);
    dely=abs(y2-y1);
    p=(2*delx)-dely;
    for(i=0;i<dely;i++){
        if(p<0)
        {
            y1=y1+1;
            putpixel(x1,y1,RED);
            p=p+(2*delx);
        }
        else
        {
            x1=x1+1;
            y1=y1+1;
            putpixel(x1,y1,RED);
            p=p+(2*delx)-(2*dely);
        }
    }
    getch();
    closegraph();
    return 0;
}
Chapter 39: Longest Common Subsequence
Examples
Longest Common Subsequence Explanation
One of the most important implementations of Dynamic Programming is finding out the Longest
Common Subsequence. Let's define some of the basic terminologies first.
Subsequence:
A subsequence is a sequence that can be derived from another sequence by deleting some
elements without changing the order of the remaining elements. Let's say we have a string ABC. If
we erase zero or one or more than one character from this string we get the subsequence of this
string. So the subsequences of string ABC will be {"A", "B", "C", "AB", "AC", "BC", "ABC", " "
}. Even if we remove all the characters, the empty string will also be a subsequence. To find out
a subsequence, for each character in a string we have two options - either we take the
character, or we don't. So if the length of the string is n, there are 2^n subsequences of that string.
As the name suggests, of all the common subsequences between two strings, the longest common
subsequence (LCS) is the one with the maximum length. For example: the common
subsequences between "HELLOM" and "HMLD" are "H", "HL", "HM" etc. Here "HL" is a
longest common subsequence, which has length 2.
Brute-Force Method:
We can generate all the subsequences of two strings using backtracking. Then we can compare
them to find out the common subsequences. After that, we'll need to find the one with the maximum
length. We have already seen that there are 2^n subsequences of a string of length n. It would
take years to solve the problem if our n crosses 20-25.
Let's approach our method with an example. Assume that, we have two strings abcdaf and acbcf.
Let's denote these with s1 and s2. So the longest common subsequence of these two strings will
be "abcf", which has length 4. Again I remind you, subsequences need not be continuous in the
string. To construct "abcf", we ignored "da" in s1 and "c" in s2. How do we find this out using
Dynamic Programming?
We'll start with a table (a 2D array) having all the characters of s1 in a row and all the characters
of s2 in a column. Here the table is 0-indexed and we put the characters from index 1 onwards. We'll
traverse the table from left to right for each row. Our table will look like:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
Here each row and column represent the length of the longest common subsequence between
two strings if we take the characters of that row and column and add to the prefix before it. For
example: Table[2][3] represents the length of the longest common subsequence between "ac"
and "abc".
The 0-th column represents the empty subsequence of s1. Similarly the 0-th row represents the
empty subsequence of s2. If we take an empty subsequence of a string and try to match it with
another string, no matter how long the length of the second substring is, the common
subsequence will have 0 length. So we can fill-up the 0-th rows and 0-th columns with 0's. We get:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
Let's begin. When we're filling Table[1][1], we're asking ourselves, if we had a string a and
another string a and nothing else, what will be the longest common subsequence here? The
length of the LCS here will be 1. Now let's look at Table[1][2]. We have string ab and string a. The
length of the LCS will be 1. As you can see, the rest of the values will be also 1 for the first row as
it considers only string a with abcd, abcda, abcdaf. So our table will look like:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
Now for row 2, which includes c: for Table[2][1] we have ac on one side and a on the other
side. So the length of the LCS is 1. Where did we get this 1 from? From the top, which denotes the
LCS a between the two substrings. So what we are saying is: if s2[2] and s1[1] are not the same, then
the length of the LCS will be the maximum of the length of the LCS at the top and at the left. Taking
the length of the LCS at the top denotes that we don't take the current character from s2.
Similarly, taking the length of the LCS at the left denotes that we don't take the current character
from s1 to create the LCS. We get:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | 1 | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
Moving on, for Table[2][2] we have strings ab and ac. Since c and b are not the same, we put the
maximum of the top or left value here. In this case, it's again 1. After that, for Table[2][3] we have strings
abc and ac. This time the current characters of both row and column are the same. Now the length of the LCS
will be equal to the maximum length of the LCS so far + 1. How do we get the maximum length of the LCS
so far? We check the diagonal value, which represents the best match between ab and a. From
this state, for the current values, we added one more character to s1 and s2, which happened to
be the same. So the length of the LCS will of course increase. We'll put 1 + 1 = 2 in Table[2][3]. We
get,
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | 1 | 1 | 2 | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
We have defined both the cases. Using these two formulas, we can populate the whole table. After
filling up the table, it will look like this:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | 1 | 1 | 2 | 2 | 2 | 2 |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | 1 | 2 | 2 | 2 | 2 | 2 |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | 1 | 2 | 3 | 3 | 3 | 3 |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | 1 | 2 | 3 | 3 | 3 | 4 |
+-----+-----+-----+-----+-----+-----+-----+-----+
The length of the LCS between s1 and s2 will be Table[5][6] = 4. Here, 5 and 6 are the length of
s2 and s1 respectively. Our pseudo-code will be:
Procedure LCSlength(s1, s2):
Table[0][0] = 0
for i from 1 to s2.length
    Table[i][0] = 0
endfor
for j from 1 to s1.length
    Table[0][j] = 0
endfor
for i from 1 to s2.length
    for j from 1 to s1.length
        if s2[i] == s1[j]
            Table[i][j] = Table[i-1][j-1] + 1
        else
            Table[i][j] = max(Table[i-1][j], Table[i][j-1])
        endif
    endfor
endfor
Return Table[s2.length][s1.length]
The time complexity for this algorithm is O(mn), where m and n denote the lengths of the two
strings.
How do we find out the longest common subsequence? We'll start from the bottom-right corner.
We will check from where the value is coming. If the value is coming from the diagonal, that is if
Table[i-1][j-1] is equal to Table[i][j] - 1, we push either s2[i] or s1[j] (both are the same) and
move diagonally. If the value is coming from top, that means, if Table[i-1][j] is equal to Table[i][j],
we move to the top. If the value is coming from left, that means, if Table[i][j-1] is equal to
Table[i][j], we move to the left. When we reach the leftmost column or the topmost row, our search ends.
Then we pop the values from the stack and print them. The pseudo-code:

Procedure PrintLCS(Table, s1, s2):
S = stack()
i := s2.length
j := s1.length
while i is not 0 and j is not 0
    if Table[i-1][j-1] == Table[i][j] - 1 and s1[j] == s2[i]
        S.push(s1[j])            // the characters are the same, move diagonally
        i := i - 1
        j := j - 1
    else if Table[i-1][j] == Table[i][j]
        i := i - 1               // the value came from the top
    else
        j := j - 1               // the value came from the left
endwhile
while S is not empty
    print(S.pop)
endwhile
Point to be noted: if both Table[i-1][j] and Table[i][j-1] is equal to Table[i][j] and Table[i-1][j-1] is
not equal to Table[i][j] - 1, there can be two LCS for that moment. This pseudo-code doesn't
consider this situation. You'll have to solve this recursively to find multiple LCSs.
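For reference, here is a small C# sketch of the table-filling part described above; the class and method names are illustrative choices:

using System;

public static class Lcs
{
    public static int Length(string s1, string s2)
    {
        // table[i, j] = length of the LCS of the first i characters of s2
        // and the first j characters of s1
        var table = new int[s2.Length + 1, s1.Length + 1];
        for (int i = 1; i <= s2.Length; i++)
            for (int j = 1; j <= s1.Length; j++)
                table[i, j] = s2[i - 1] == s1[j - 1]
                    ? table[i - 1, j - 1] + 1
                    : Math.Max(table[i - 1, j], table[i, j - 1]);
        return table[s2.Length, s1.Length];
    }
}

For example, Lcs.Length("abcdaf", "acbcf") returns 4, matching the table above.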
Chapter 40: Longest Increasing Subsequence
Examples
Longest Increasing Subsequence Basic Information
The Longest Increasing Subsequence problem is to find a subsequence of the given input
sequence whose elements are sorted from lowest to highest. The subsequence does not need to be
contiguous, and it is not necessarily unique.
Algorithms like Longest Increasing Subsequence and Longest Common Subsequence are used in
version control systems like Git.
Now let us consider a simpler variant of the LCS problem. Here, the input is only one sequence of
distinct integers a1,a2,...,an, and we want to find the longest increasing subsequence in it. For
example, if the input is 7,3,8,4,2,6 then the longest increasing subsequence is 3,4,6.
The easiest approach is to sort input elements in increasing order, and apply the LCS algorithm to
the original and sorted sequences. However, if you look at the resulting array you would notice
that many values are the same, and the array looks very repetitive. This suggests that the LIS
(longest increasing subsequence) problem can be solved with a dynamic programming algorithm
using only a one-dimensional array.
Pseudo Code:
The following program uses A to compute an optimal solution. The first part computes a value m
such that A(m) is the length of an optimal increasing subsequence of input. The second part
computes an optimal increasing subsequence, but for convenience we print it out in reverse order.
This program runs in time O(n), so the entire algorithm runs in time O(n^2).
Part 1:
m ← 1
for i : 2..n
if A(i) > A(m) then
m ← i
end if
end for
Part 2:
put a_m
while A(m) > 1 do
    i ← m−1
    while not(a_i < a_m and A(i) = A(m)−1) do
        i ← i−1
    end while
    m ← i
    put a_m
end while
Recursive Solution:
Approach 1:
LIS(A[1..n]):
if (n = 0) then return 0
m = LIS(A[1..(n − 1)])
B is subsequence of A[1..(n − 1)] with only elements less than a[n]
(* let h be size of B, h ≤ n-1 *)
m = max(m, 1 + LIS(B[1..h]))
Output m
Approach 2:
LIS(A[1..n], x):
if (n = 0) then return 0
m = LIS(A[1..(n − 1)], x)
if (A[n] < x) then
m = max(m, 1 + LIS(A[1..(n − 1)], A[n]))
Output m
MAIN(A[1..n]):
return LIS(A[1..n], ∞)
Approach 3:
LIS(A[1..n]):
if (n = 0) return 0
m = 1
for i = 1 to n − 1 do
if (A[i] < A[n]) then
m = max(m, 1 + LIS(A[1..i]))
return m
MAIN(A[1..n]):
    return the maximum of LIS(A[1..i]) over all i in 1..n
Iterative Algorithm:
LIS(A[1..n]):
Array L[1..n]
(* L[i] = value of LIS ending(A[1..i]) *)
for i = 1 to n do
L[i] = 1
for j = 1 to i − 1 do
if (A[j] < A[i]) do
L[i] = max(L[i], 1 + L[j])
return L
MAIN(A[1..n]):
L = LIS(A[1..n])
return the maximum value in L
Lets take {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15} as input. So, Longest Increasing
Subsequence for the given input is {0, 2, 6, 9, 11, 15}.
C# Implementation

public class LongestIncreasingSubsequence
{
    public static int Lis(int[] input)
    {
        int n = input.Length;
        int max = 0;
        var lis = new int[n];
        for (int i = 0; i < n; i++)
        {
            lis[i] = 1;          // every element is an increasing subsequence of length 1
        }
        for (int i = 1; i < n; i++)
        {
            for (int j = 0; j < i; j++)
            {
                if (input[i] > input[j] && lis[i] < lis[j] + 1)
                    lis[i] = lis[j] + 1;   // extend the best subsequence ending at j
            }
        }
        for (int i = 0; i < n; i++)
        {
            if (max < lis[i])
                max = lis[i];
        }
        return max;
    }
}
Chapter 41: Lowest common ancestor of a
Binary Tree
Introduction
The lowest common ancestor of two nodes n1 and n2 is defined as the lowest node in the tree
that has both n1 and n2 as descendants (where a node is allowed to be a descendant of itself).
Examples
Finding lowest common ancestor
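A minimal C# sketch of the classic recursive approach: walk the tree once and, at each node, check whether n1 and n2 are found in different subtrees. The Node type and method names below are illustrative assumptions:

public class Node
{
    public int Value;
    public Node Left, Right;
}

public static class Lca
{
    // assumes both n1 and n2 are present in the tree
    public static Node Find(Node root, int n1, int n2)
    {
        if (root == null)
            return null;
        if (root.Value == n1 || root.Value == n2)
            return root;                  // a node counts as its own descendant
        Node left = Find(root.Left, n1, n2);
        Node right = Find(root.Right, n1, n2);
        if (left != null && right != null)
            return root;                  // n1 and n2 lie on different sides
        return left ?? right;             // both lie in the same subtree
    }
}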
Chapter 42: Matrix Exponentiation
Examples
Matrix Exponentiation to Solve Example Problems
Find f(n): the nth Fibonacci number. The problem is quite easy when n is relatively small. We can use
simple recursion, f(n) = f(n-1) + f(n-2), or we can use a dynamic programming approach to avoid
calculating the same function over and over again. But what will you do if the problem says,
Given 0 < n < 10⁹, find f(n) mod 999983? A linear dynamic programming pass would be far too
slow here, so how do we tackle this problem?
First let's see how matrix exponentiation can help to represent recursive relation.
Prerequisites:
• Given two matrices, know how to find their product. Further, given the product matrix of two
matrices, and one of them, know how to find the other matrix.
• Given a matrix of size d X d, know how to find its nth power in O(d^3 log(n)).
Patterns:
At first we need a recursive relation and we want to find a matrix M which can lead us to the
desired state from a set of already known states. Let's assume that, we know the k states of a
given recurrence relation and we want to find the (k+1)th state. Let M be a k X k matrix, and we
build a matrix A:[k X 1] from the known states of the recurrence relation, now we want to get a
matrix B:[k X 1] which will represent the set of next states, i. e. M X A = B as shown below:
| f(n) | | f(n+1) |
| f(n-1) | | f(n) |
M X | f(n-2) | = | f(n-1) |
| ...... | | ...... |
| f(n-k) | |f(n-k+1)|
So, if we can design M accordingly, our job will be done! The matrix will then be used to represent
the recurrence relation.
Type 1:
Let's start with the simplest one, f(n) = f(n-1) + f(n-2)
We get, f(n+1) = f(n) + f(n-1).
Let's assume, we know f(n) and f(n-1); We want to find out f(n+1).
From the situation stated above, matrix A and matrix B can be formed as shown below:
Matrix A Matrix B
| f(n) | | f(n+1) |
| f(n-1) | | f(n) |
[Note: Matrix A will be always designed in such a way that, every state on which f(n+1) depends,
will be present]
Now, we need to design a 2X2 matrix M such that, it satisfies M X A = B as stated above.
The first element of B is f(n+1) which is actually f(n) + f(n-1). To get this, from matrix A, we need,
1 X f(n) and 1 X f(n-1). So the first row of M will be [1 1].
| 1 1 |  X  | f(n)   |  =  | f(n+1) |
| ----- |    | f(n-1) |     | ------ |
The second element of B is f(n), which is already present in matrix A. To keep it we need
1 X f(n) and 0 X f(n-1), so the second row of M will be [1 0]:
| 1 1 |  X  | f(n)   |  =  | f(n+1) |
| 1 0 |     | f(n-1) |     | f(n)   |
Type 2:
Let's make it a little complex: find f(n) = a X f(n-1) + b X f(n-2), where a and b are constants.
This tells us, f(n+1) = a X f(n) + b X f(n-1).
By now, it should be clear that the dimension of the matrices will be equal to the number of
dependencies, i.e. in this particular example, again 2. So for A and B, we can build two matrices of
size 2 X 1:
Matrix A Matrix B
| f(n) | | f(n+1) |
| f(n-1) | | f(n) |
Now for f(n+1) = a X f(n) + b X f(n-1), we need [a, b] in the first row of objective matrix M. And
for the 2nd item in B, i.e. f(n) we already have that in matrix A, so we just take that, which leads,
the 2nd row of the matrix M to [1 0]. This time we get:
| a b | X | f(n) | = | f(n+1) |
| 1 0 | | f(n-1) | | f(n) |
Type 3:
If you've survived through to this stage, you've grown much older. Now let's face a slightly more
complex relation: find f(n) = a X f(n-1) + c X f(n-3).
Ooops! A few minutes ago, all we saw were contiguous states, but here, the state f(n-2) is
missing. Now?
Actually this is not a problem anymore, we can convert the relation as follows: f(n) = a X f(n-1) +
0 X f(n-2) + c X f(n-3), deducing f(n+1) = a X f(n) + 0 X f(n-1) + c X f(n-2). Now, we see that,
this is actually a form described in Type 2. So here the objective matrix M will be 3 X 3, and the
elements are:
| a 0 c | | f(n) | | f(n+1) |
| 1 0 0 | X | f(n-1) | = | f(n) |
| 0 1 0 | | f(n-2) | | f(n-1) |
These are calculated in the same way as type 2, if you find it difficult, try it on pen and paper.
Type 4:
Life is getting complex as hell, and Mr. Problem now asks you to find f(n) = f(n-1) + f(n-2) + c
where c is any constant.
Now this is a new one. In everything we have seen so far, each state in A simply transformed into its
next state in B after the multiplication, so we normally can't produce a constant that way. But how
about we add c itself as a state:
| f(n) | | f(n+1) |
M X | f(n-1) | = | f(n) |
| c | | c |
Now, it's not that hard to design M. Here's how it's done, but don't forget to verify:
| 1 1 1 | | f(n) | | f(n+1) |
| 1 0 0 | X | f(n-1) | = | f(n) |
| 0 0 1 | | c | | c |
Type 5:
Let's put it altogether: find f(n) = a X f(n-1) + c X f(n-3) + d X f(n-4) + e. Let's leave it as an
exercise for you. First try to find out the states and matrix M. And check if it matches with your
solution. Also find matrix A and B.
| a 0 c d 1 |
| 1 0 0 0 0 |
| 0 1 0 0 0 |
| 0 0 1 0 0 |
| 0 0 0 0 1 |
Type 6:
Sometimes the recurrence is given with different formulas depending on, say, the parity of n.
In short: here we can split the function on the basis of odd and even n, keep 2 different matrices
for the two cases, and calculate them separately.
Type 7:
Feeling a little too confident? Good for you. Sometimes we may need to maintain more than one
recurrence, where they interact with each other. For example, let a pair of recurrence relations be:
g(n) = 2g(n-1) + 2g(n-2) + f(n), and f(n) = 2f(n-1) + 2f(n-2).
Here, recurrence g(n) is dependent upon f(n), and this can be calculated in the same matrix but of
increased dimensions. From these let's at first design the matrices A and B.
Matrix A Matrix B
| g(n) | | g(n+1) |
| g(n-1) | | g(n) |
| f(n+1) | | f(n+2) |
| f(n) | | f(n+1) |
Here, g(n+1) = 2g(n) + 2g(n-1) + f(n+1) and f(n+2) = 2f(n+1) + 2f(n). Now, using the processes stated
above, we can find the objective matrix M to be:
| 2 2 1 0 |
| 1 0 0 0 |
| 0 0 2 2 |
| 0 0 1 0 |
So, these are the basic categories of recurrence relations which can be solved by this simple
technique.
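To connect this back to the opening question, the sketch below computes f(n) mod 999983 with the Type 1 matrix M = [[1, 1], [1, 0]] and exponentiation by squaring, so only O(log n) matrix multiplications are needed. Class and method names are illustrative:

public static class MatrixFibonacci
{
    private const long Mod = 999983;

    private static long[,] Multiply(long[,] x, long[,] y)
    {
        var r = new long[2, 2];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                r[i, j] = (x[i, 0] * y[0, j] + x[i, 1] * y[1, j]) % Mod;
        return r;
    }

    private static long[,] Power(long[,] m, long n)
    {
        var result = new long[2, 2] { { 1, 0 }, { 0, 1 } };  // identity matrix
        while (n > 0)
        {
            if ((n & 1) == 1)
                result = Multiply(result, m);
            m = Multiply(m, m);                              // square, halve n
            n >>= 1;
        }
        return result;
    }

    public static long F(long n)
    {
        if (n == 0) return 0;
        var m = new long[2, 2] { { 1, 1 }, { 1, 0 } };
        return Power(m, n - 1)[0, 0];   // M^(n-1) maps [f(1), f(0)] to [f(n), f(n-1)]
    }
}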
Chapter 43: Maximum Path Sum Algorithm
Examples
Maximum Path Sum Basic Information
Maximum Path Sum is an algorithm to find a path such that the sum of the elements (nodes) on that
path is greater than that of any other path.
3
7 4
2 4 6
8 5 9 3
In the above triangle, find the path which has the maximum sum. The answer is 3 + 7 + 4 + 9 = 23.
To find the solution, as always, we first think of a brute-force method. Brute force is
fine for this 4-row triangle, but think about a triangle with 100 or more rows. So we can
not use the brute-force method to solve this problem.
Algorithm:
For each and every node in a triangle or in a binary tree there can be four ways that the max path
goes through the node.
1. Node only
2. Max path through Left Child + Node
3. Max path through Right Child + Node
4. Max path through Left Child + Node + Max path through Right Child.
Space Auxiliary: O(n)
Time Complexity: O(n)
C# Implementation
class Res
{
public int Val;
}
int FindMaxUtil(Node node, Res res)
{
    if (node == null) return 0;
    int l = FindMaxUtil(node.Left, res);
    int r = FindMaxUtil(node.Right, res);
    int maxSingle = Math.Max(Math.Max(l, r) + node.Value, node.Value);
    int maxTop = Math.Max(maxSingle, l + r + node.Value);
    res.Val = Math.Max(res.Val, maxTop);
    return maxSingle;
}
int FindMaxSum()
{
    return FindMaxSum(_root);
}

int FindMaxSum(Node node)
{
    var res = new Res { Val = int.MinValue };
    FindMaxUtil(node, res);
    return res.Val;
}
Chapter 44: Maximum Subarray Algorithm
Examples
Maximum Subarray Algorithm Basic Information
Maximum subarray problem is the method to find the contiguous subarray within a one-
dimensional array of numbers which has the largest sum.
The problem was originally proposed by Ulf Grenander of Brown University in 1977, as a simplified
model for maximum likelihood estimation of patterns in digitized images.
We can state the problem like this: let us consider a list of various integers. We might be interested in which
completely adjacent subset will have the greatest sum. For example, if we have the array [0, 1,
2, -4, 3, 2], the maximum subarray is [3, 2], with a sum of 5.
Brute-Force Method:
This method is the most inefficient way to find the solution. In it, we end up going through every
single possible subarray, finding the sum of each of them. At last, we compare all the values and
find the maximum subarray.
MaxSubarray(array)
    maximum = 0
    for i from 1 to length(array)
        current = 0
        for j from i to length(array)
            current += array[j]
            if current > maximum
                maximum = current
    return maximum
Time complexity for Brute-Force method is O(n^2). So let's move to divide and conquer
approach.
Find the sum of the subarrays on the left side, the subarrays on the right. Then, take a look
through all of the ones that cross over the center divide, and finally return the maximum sum.
Because this is a divide and conquer algorithm, we need to have two different functions.
maxSubarray(array, start, end)
    if start = end
        return array[start]
    else
        middle = (start + end) / 2
        return max(maxSubarray(array, start, middle),
                   maxSubarray(array, middle + 1, end),
                   maxCrossover(array, start, middle, end))
The second part handles the subarrays that cross the middle, which the first part cannot cover.
maxCrossover(array, start, middle, end)
    currentLeftSum = 0
    leftSum = -infinity
    for i from middle downto start
        currentLeftSum += array[i]
        if currentLeftSum > leftSum
            leftSum = currentLeftSum
    currentRightSum = 0
    rightSum = -infinity
    for i from middle + 1 to end
        currentRightSum += array[i]
        if currentRightSum > rightSum
            rightSum = currentRightSum
    return leftSum + rightSum
The time complexity for the divide and conquer method is O(n log n). So let's move to the dynamic
programming approach.
This solution is also known as Kadane's Algorithm. It is a linear-time algorithm. This solution was given
by Joseph B. Kadane in the late '70s.
This algorithm just goes through the loop, continuously changing the current maximum sum.
Interestingly enough, this is a very simple example of a dynamic programming algorithm, since it
takes an overlapping problem and reduces it so we can find a more efficient solution.
MaxSubArray(array)
max = 0
currentMax = 0
for i in array
currentMax += array[i]
if currentMax < 0
currentMax = 0
if max < currentMax
max = currentMax
return max
C# Implementation

public class MaximumSubarray
{
    private static int Max(int a, int b)
    {
        return a > b ? a : b;
    }

    public static int MaxSubArray(int[] input)
    {
        int max = 0;
        int currentMax = 0;
        for (int i = 0; i < input.Length; i++)
        {
            currentMax = Max(0, currentMax + input[i]);  // reset when the running sum drops below 0
            max = Max(max, currentMax);
        }
        return max;
    }
}
Chapter 45: Merge Sort
Examples
Merge Sort Basics
Merge Sort is a divide-and-conquer algorithm. It divides the input list of length n in half
successively until there are n lists of size 1. Then, pairs of lists are merged together with the
smaller first element among the pair of lists being added in each step. Through successive
merging and through comparison of first elements, the sorted list is built.
An example: to merge the sorted halves [1, 4, 7] and [2, 3, 6], repeatedly take the smaller of the
two first elements, producing [1, 2, 3, 4, 6, 7].
Merge Sort's running time satisfies the recurrence T(n) = 2T(n/2) + Θ(n). This recurrence can be
solved either using the Recurrence Tree method or the Master method. It falls in case II of the
Master method, and the solution of the recurrence is Θ(n log n). The time complexity of
Merge Sort is Θ(n log n) in all 3 cases (worst, average and best), as merge sort always divides the
array into two halves and takes linear time to merge the two halves.
Stable: Yes
C Merge Sort

#include <stdio.h>

int arr1[50], arr2[50];

int merge(int arr[], int l, int m, int h)
{
    int i, j, k;
    int n1 = m - l + 1, n2 = h - m;
    for (i = 0; i < n1; i++)        /* copy the left half */
        arr1[i] = arr[l + i];
    for (j = 0; j < n2; j++)        /* copy the right half */
        arr2[j] = arr[m + 1 + j];
    i = 0;
    j = 0;
    for (k = l; k <= h; k++) {      /* process of combining two sorted arrays */
        if (i < n1 && (j >= n2 || arr1[i] <= arr2[j]))
            arr[k] = arr1[i++];
        else
            arr[k] = arr2[j++];
    }
    return 0;
}

int merge_sort(int arr[], int l, int h)
{
    if (l < h) {
        int m = l + (h - l) / 2;
        merge_sort(arr, l, m);
        merge_sort(arr, m + 1, h);
        merge(arr, l, m, h);
    }
    return 0;
}
C# Merge Sort

using System;

public static class MergeSorter
{
    public static void Merge(int[] input, int l, int m, int r)
    {
        int n1 = m - l + 1, n2 = r - m;
        var left = new int[n1];
        var right = new int[n2];
        Array.Copy(input, l, left, 0, n1);
        Array.Copy(input, m + 1, right, 0, n2);
        int i = 0;
        int j = 0;
        var k = l;
        while (i < n1 && j < n2)                    // take the smaller head element
            input[k++] = left[i] <= right[j] ? left[i++] : right[j++];
        while (i < n1) input[k++] = left[i++];      // drain the leftovers
        while (j < n2) input[k++] = right[j++];
    }

    public static void SortMerge(int[] input, int l, int r)
    {
        if (l < r)
        {
            int m = l + (r - l) / 2;
            SortMerge(input, l, m);
            SortMerge(input, m + 1, r);
            Merge(input, l, m, r);
        }
    }
}
Below is an implementation in Java using a generics approach. It is the same algorithm as
presented above.

public class MergeSort < T extends Comparable < T >> implements InPlaceSort < T > {

    @Override
    public void sort(T[] elements) {
        T[] arr = (T[]) new Comparable[elements.length];
        sort(elements, arr, 0, elements.length - 1);
    }

    private void sort(T[] a, T[] b, int low, int high) {
        if (low >= high) return;
        int mid = low + (high - low) / 2;
        sort(a, b, low, mid);
        sort(a, b, mid + 1, high);
        merge(a, b, low, high, mid);
    }

    private void merge(T[] a, T[] b, int low, int high, int mid) {
        int i = low;
        int j = mid + 1;
        // We select the smallest element of the two. And then we put it into b
        for (int k = low; k <= high; k++) {
            if (i > mid) {
                b[k] = a[j++];
            } else if (j > high) {
                b[k] = a[i++];
            } else if (a[i].compareTo(a[j]) <= 0) {
                b[k] = a[i++];
            } else {
                b[k] = a[j++];
            }
        }
        // copy the merged run back into a
        for (int k = low; k <= high; k++) {
            a[k] = b[k];
        }
    }
}
from random import randint

def merge(left, right):
    # combine two sorted lists into one sorted list
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def mergeSort(A):
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    return merge(mergeSort(A[:mid]), mergeSort(A[mid:]))

if __name__ == "__main__":
    # Generate 20 random numbers and sort them
    A = [randint(1, 100) for i in range(20)]
    print(mergeSort(A))
public class MergeSortBU {

    public MergeSortBU() {
    }

    private static boolean isLess(Comparable a, Comparable b) {
        return a.compareTo(b) < 0;
    }

    private static void print(Comparable[] a) {
        System.out.println(java.util.Arrays.toString(a));
    }

    private static void merge(Comparable[] arrayToSort, Comparable[] aux, int lo, int mid, int hi) {
        // copy the current run into the auxiliary array first
        for (int k = lo; k <= hi; k++)
            aux[k] = arrayToSort[k];

        int i = lo;
        int j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid)
                arrayToSort[k] = aux[j++];
            else if (j > hi)
                arrayToSort[k] = aux[i++];
            else if (isLess(aux[i], aux[j])) {
                arrayToSort[k] = aux[i++];
            } else {
                arrayToSort[k] = aux[j++];
            }
        }
    }

    public static void sort(Comparable[] arrayToSort, Comparable[] aux, int lo, int hi) {
        int N = arrayToSort.length;
        for (int sz = 1; sz < N; sz = sz + sz) {
            for (int low = 0; low < N - sz; low = low + sz + sz) {
                System.out.println("Size:" + sz);
                merge(arrayToSort, aux, low, low + sz - 1, Math.min(low + sz + sz - 1, N - 1));
                print(arrayToSort);
            }
        }
    }
}
package main

import "fmt"

func merge(f []int, s []int) []int {
	// combine two sorted slices into one sorted slice
	r := make([]int, 0, len(f)+len(s))
	i, j := 0, 0
	for i < len(f) && j < len(s) {
		if f[i] <= s[j] {
			r = append(r, f[i])
			i++
		} else {
			r = append(r, s[j])
			j++
		}
	}
	r = append(r, f[i:]...)
	r = append(r, s[j:]...)
	return r
}

func mergeSort(a []int) []int {
	if len(a) > 1 {
		m := len(a) / 2
		f := mergeSort(a[:m])
		s := mergeSort(a[m:])
		return merge(f, s)
	}
	return a
}

func main() {
	a := []int{75, 12, 34, 45, 0, 123, 32, 56, 32, 99, 123, 11, 86, 33}
	fmt.Println(a)
	fmt.Println(mergeSort(a))
}
https://riptutorial.com/ 232
Chapter 46: Multithreaded Algorithms
Introduction
Examples for some multithreaded algorithms.
Syntax
• parallel before a loop means each iteration of the loop is independent from the others and
they can be run in parallel.
• spawn indicates the creation of a new thread.
• sync synchronizes all created threads.
• Arrays/matrices are indexed 1 to n in the examples.
Examples
Square matrix multiplication multithread
multiply-square-matrix-parallel(A, B)
n = A.lines
C = Matrix(n,n) //create a new matrix n*n
parallel for i = 1 to n
parallel for j = 1 to n
C[i][j] = 0
for k = 1 to n
C[i][j] = C[i][j] + A[i][k]*B[k][j]
return C
matrix-vector(A,x)
n = A.lines
y = Vector(n) //create a new vector of length n
parallel for i = 1 to n
y[i] = 0
parallel for i = 1 to n
for j = 1 to n
y[i] = y[i] + A[i][j]*x[j]
return y
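In a real language, the parallel for above maps naturally onto a parallel loop construct. Below is a minimal C# sketch of matrix-vector using Parallel.For; every y[i] is independent, so no synchronization is needed:

using System.Threading.Tasks;

public static class MatrixVector
{
    public static double[] Multiply(double[][] a, double[] x)
    {
        var y = new double[a.Length];
        // "parallel for i = 1 to n": the iterations never touch each other's y[i]
        Parallel.For(0, a.Length, i =>
        {
            y[i] = 0;
            for (int j = 0; j < x.Length; j++)
                y[i] += a[i][j] * x[j];
        });
        return y;
    }
}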
merge-sort multithread
A is an array, and p and r are indexes of the array such that we sort the sub-array A[p..r]. B is an
auxiliary array which will be populated by the sort.
A call to p-merge-sort(A,p,r,B,s) sorts elements from A[p..r] and puts them in B[s..s+r-p].
p-merge-sort(A,p,r,B,s)
n = r-p+1
if n==1
B[s] = A[p]
else
T = new Array(n) //create a new array T of size n
q = floor((p+r)/2))
q_prime = q-p+1
spawn p-merge-sort(A,p,q,T,1)
p-merge-sort(A,q+1,r,T,q_prime+1)
sync
p-merge(T,1,q_prime,q_prime+1,n,B,s)
p-merge(T,p1,r1,p2,r2,A,p3)
n1 = r1-p1+1
n2 = r2-p2+1
if n1<n2 //check if n1>=n2
permute p1 and p2
permute r1 and r2
permute n1 and n2
if n1==0 //both empty?
return
else
q1 = floor((p1+r1)/2)
q2 = dichotomic-search(T[q1],T,p2,r2)
q3 = p3 + (q1-p1) + (q2-p2)
A[q3] = T[q1]
spawn p-merge(T,p1,q1-1,p2,q2-1,A,p3)
p-merge(T,q1+1,r1,q2,r2,A,q3+1)
sync
dichotomic-search(x,T,p,r)
inf = p
sup = max(p,r+1)
while inf<sup
half = floor((inf+sup)/2)
if x<=T[half]
sup = half
else
inf = half+1
return sup
Chapter 47: Odd-Even Sort
Examples
Odd-Even Sort Basic Information
An Odd-Even Sort or brick sort is a simple sorting algorithm, developed for use on parallel
processors with local interconnection. It works by comparing all odd/even indexed pairs of
adjacent elements in the list and, if a pair is in the wrong order, switching the elements. The
next step repeats this for even/odd indexed pairs. It alternates between odd/even and
even/odd steps until the list is sorted.
The related recursive odd-even merge step for a sequence of length n works as follows:

if n>2 then
1. apply odd-even merge(n/2) recursively to the even subsequence a0, a2, ..., an-2 and to
the odd subsequence a1, a3, , ..., an-1
2. comparison [i : i+1] for all i element {1, 3, 5, 7, ..., n-3}
else
comparison [0 : 1]
Implementation:
public static void OddEvenSort(int[] input)
{
    var n = input.Length;
    var sort = false;
    while (!sort)
{
sort = true;
for (var i = 1; i < n - 1; i += 2)
{
if (input[i] <= input[i + 1]) continue;
var temp = input[i];
input[i] = input[i + 1];
input[i + 1] = temp;
sort = false;
}
for (var i = 0; i < n - 1; i += 2)
{
if (input[i] <= input[i + 1]) continue;
var temp = input[i];
input[i] = input[i + 1];
input[i + 1] = temp;
sort = false;
}
}
}
Chapter 48: Online algorithms
Remarks
Theory
Definition 1: An optimization problem Π consists of a set of instances ΣΠ. For every instance
σ∈ΣΠ there is a set Ζσ of solutions and an objective function fσ : Ζσ → ℜ≥0 which assigns
a positive real value to every solution.
We say OPT(σ) is the value of an optimal solution, A(σ) is the solution of an Algorithm A for the
problem Π and wA(σ)=fσ(A(σ)) its value.
Definition 2: An online algorithm A for a minimization problem Π has a competitive ratio of r ≥ 1
if there is a constant τ∈ℜ with
wA(σ) ≤ r ⋅ OPT(σ) + τ
for all instances σ∈ΣΠ. If this inequality holds with τ = 0, then A is called a strictly r-competitive
online algorithm.
Lemma: FWF is a marking algorithm.
Proof: At the beginning of each phase (except for the first one) FWF has a cache miss and
clears the cache. That means we have k empty pages. In every phase at most k different
pages are requested, so there will be no eviction during the phase. So FWF is a marking algorithm.
Lemma: LRU is a marking algorithm.
Proof: Let's assume LRU is not a marking algorithm. Then there is an instance σ where LRU evicted a
marked page x in phase i. Let σt be the request in phase i where x is evicted. Since x is marked, there
has to be an earlier request σt* for x in the same phase, so t* < t. After t*, x is the newest page in the
cache, so for it to get evicted at t, the sequence σt*+1,...,σt has to request at least k pages different
from x. That implies phase i has requested at least k+1 different pages, which contradicts
the phase definition. So LRU has to be a marking algorithm.
Lemma: Every marking algorithm is k-competitive.
Proof: Let σ be an instance for the paging problem and l the number of phases for σ. If l = 1 then
every marking algorithm is optimal and the optimal offline algorithm cannot be better.
We assume l ≥ 2. The cost of every marking algorithm for instance σ is bounded from above by l ⋅
k, because in every phase a marking algorithm cannot evict more than k pages without evicting one marked
page.
Now we try to show that the optimal offline algorithm evicts at least k+l-2 pages for σ: k in the first
phase and at least one for every following phase except for the last one. For the proof let's define l-2
disjoint subsequences of σ. Subsequence i ∈ {1,...,l-2} starts at the second position of phase i+1
and ends with the first position of phase i+2.
Let x be the first page of phase i+1. At the beginning of subsequence i there are page x and at most
k-1 other pages in the optimal offline algorithm's cache. In subsequence i there are k page requests
different from x, so the optimal offline algorithm has to evict at least one page for every
subsequence. Since at the beginning of phase 1 the cache is still empty, the optimal offline algorithm
causes k evictions during the first phase. That shows that
wA(σ) ≤ l ⋅ k ≤ (k+l-2) ⋅ k ≤ OPT(σ) ⋅ k,
so every marking algorithm is k-competitive.
If there is no constant r for which an online algorithm A is r-competitive, we call A not competitive.
Proposition: LFU and LIFO are not competitive.
Proof: Let l ≥ 2 be a constant and k ≥ 2 the cache size. The different cache pages are numbered
1,...,k+1. We look at the following sequence:
σ = (1^l, 2^l, ..., (k-1)^l, (k, k+1)^(l-1))
First page 1 is requested l times, then page 2, and so on. At the end there are (l-1) alternating
requests for pages k and k+1.
LFU and LIFO fill their cache with pages 1-k. When page k+1 is requested, page k is evicted and
vice versa. That means every request of the subsequence (k,k+1)^(l-1) evicts one page. In addition, there
are k-1 cache misses for the first-time use of pages 1-(k-1). So LFU and LIFO evict exactly k-1+2(l-1)
pages.
Now we must show that for every constant τ∈ℜ and every constant r ≥ 1 there exists an l so that
k-1+2(l-1) > r ⋅ OPT(σ) + τ.
Since the optimal offline algorithm faults only a constant number of times on σ (at most once per
distinct page), the left-hand side grows with l while the right-hand side stays fixed. To satisfy this
inequality you just have to choose l sufficiently big. So LFU and LIFO are not competitive.
Proposition 1.7: There is no r-competitive deterministic online algorithm for paging with r < k.
Sources
Basic Material
Further Reading
1. Online Computation and Competitive Analysis by Allan Borodin and Ran El-Yaniv
Source Code
Examples
Paging (Online Caching)
Preface
Instead of starting with a formal definition, the goal is to approach these topics via a series of
examples, introducing definitions along the way. The remark section Theory consists of all
definitions, theorems and propositions, to give you all the information needed to look up specific
aspects faster.
The remark section Sources consists of the basic material used for this topic and additional
information for further reading. In addition, you will find the full source code for the examples
there. Please note that, to make the source code for the examples more readable and
shorter, it refrains from things like error handling etc. It also passes on some specific language
features which would obscure the clarity of the example, like extensive use of advanced libraries
etc.
Paging
The paging problem arises from the limitation of finite space. Let's assume our cache C has k
pages. Now we want to process a sequence of m page requests, each of which must have been placed in
the cache before it is processed. Of course if m<=k then we just put all the elements in the cache
and it will work, but usually m>>k.
We say a request is a cache hit when the page is already in the cache; otherwise, it's called a cache
miss. In that case, we must bring the requested page into the cache and evict another, assuming
the cache is full. The goal is an eviction schedule that minimizes the number of evictions.
There are numerous strategies for this problem, let's look at some:
1. First in, first out (FIFO): The oldest page gets evicted
2. Last in, first out (LIFO): The newest page gets evicted
3. Least recently used (LRU): Evict page whose most recent access was earliest
4. Least frequently used (LFU): Evict page that was least frequently requested
5. Longest forward distance (LFD): Evict page in the cache that is not requested until farthest
in the future.
6. Flush when full (FWF): clear the cache completely as soon as a cache miss happens
Offline Approach
For the first approach look at the topic Applications of Greedy technique. Its third example, Offline
Caching, considers the first five strategies from above and gives you a good entry point for the
following.
The full sourcecode is available here. If we reuse the example from the topic, we get the following
output:
Strategy: FWF
Request   cache 0   cache 1   cache 2   cache miss
a         a         b         c
a         a         b         c
d         d         X         X         x
e         d         e         X
b         d         e         b
b         d         e         b
a         a         X         X         x
c         a         c         X
f         a         c         f
d         d         X         X         x
e         d         e         X
a         d         e         a
f         f         X         X         x
b         f         b         X
e         f         b         e
c         c         X         X         x
Even though LFD is optimal, FWF has fewer cache misses here. But the main goal was to minimize the
number of evictions, and for FWF five misses mean 15 evictions, which makes it the poorest
choice for this example.
Online Approach
Now we want to approach the online problem of paging. But first we need an understanding of how
to measure it. Obviously an online algorithm cannot be better than the optimal offline algorithm. But how
much worse can it be? We need formal definitions to answer that question:
Definition 1.1: An optimization problem Π consists of a set of instances ΣΠ. For every instance
σ∈ΣΠ there is a set Ζσ of solutions and an objective function fσ : Ζσ → ℜ≥0 which assigns
a positive real value to every solution.
We say OPT(σ) is the value of an optimal solution, A(σ) is the solution of an Algorithm A for the
problem Π and wA(σ)=fσ(A(σ)) its value.
Definition 1.2: An online algorithm A for a minimization problem Π has a competitive ratio of r ≥
1 if there is a constant τ∈ℜ with
wA(σ) ≤ r ⋅ OPT(σ) + τ
for all instances σ∈ΣΠ. If this inequality holds with τ = 0, then A is called a strictly r-competitive
online algorithm.
So the question is how competitive our online algorithm is compared to an optimal offline
algorithm. In their famous book, Allan Borodin and Ran El-Yaniv used another scenario to describe
the online paging situation:
There is an evil adversary who knows your algorithm and the optimal offline algorithm. In every
step, he tries to request a page which is worst for you and simultaneously best for the offline
algorithm. The competitive factor of your algorithm is the factor by which your algorithm does worse
than the adversary's optimal offline algorithm. If you want to try to be the adversary, you can try
the Adversary Game (try to beat the paging strategies).
Marking Algorithms
Instead of analysing every algorithm separately, let's look at a special online algorithm family for
the paging problem called marking algorithms.
Let σ=(σ1,...,σp) be an instance of our problem and k our cache size. Then σ can be divided into
phases:
• Phase 1 is the maximal subsequence of σ from the start until maximal k different pages have been
requested
• Phase i ≥ 2 is the maximal subsequence of σ from the end of phase i-1 until maximal k different
pages have been requested
A marking algorithm (implicitly or explicitly) maintains whether a page is marked or not. At the
beginning of each phase, all pages are unmarked. If a page is requested during a phase, it gets
marked. An algorithm is a marking algorithm iff it never evicts a marked page from the cache. That
means pages which are used during a phase will not be evicted.
Lemma: FWF is a marking algorithm.
Proof: At the beginning of each phase (except for the first one) FWF has a cache miss and
clears the cache. That means we have k empty pages. In every phase at most k different
pages are requested, so there will be no eviction during the phase. So FWF is a marking algorithm.
Lemma: LRU is a marking algorithm.
Proof: Let's assume LRU is not a marking algorithm. Then there is an instance σ where LRU evicted a
marked page x in phase i. Let σt be the request in phase i where x is evicted. Since x is marked, there
has to be an earlier request σt* for x in the same phase, so t* < t. After t*, x is the newest page in the
cache, so for it to get evicted at t, the sequence σt*+1,...,σt has to request at least k pages different
from x. That implies phase i has requested at least k+1 different pages, which contradicts
the phase definition. So LRU has to be a marking algorithm.
Lemma: Every marking algorithm is k-competitive.
Proof: Let σ be an instance for the paging problem and l the number of phases for σ. If l = 1 then
every marking algorithm is optimal and the optimal offline algorithm cannot be better.
We assume l ≥ 2. The cost of every marking algorithm for instance σ is bounded from above by l
⋅ k, because in every phase a marking algorithm cannot evict more than k pages without evicting one marked
page.
Now we try to show that the optimal offline algorithm evicts at least k+l-2 pages for σ: k in the first
phase and at least one for every following phase except for the last one. For the proof let's define l-2
disjoint subsequences of σ. Subsequence i ∈ {1,...,l-2} starts at the second position of phase i+1
and ends with the first position of phase i+2.
Let x be the first page of phase i+1. At the beginning of subsequence i there are page x and at most
k-1 other pages in the optimal offline algorithm's cache. In subsequence i there are k page requests
different from x, so the optimal offline algorithm has to evict at least one page for every
subsequence. Since at the beginning of phase 1 the cache is still empty, the optimal offline algorithm
causes k evictions during the first phase. That shows that
wA(σ) ≤ l ⋅ k ≤ (k+l-2) ⋅ k ≤ OPT(σ) ⋅ k,
so every marking algorithm is k-competitive.
If there is no constant r for which an online algorithm A is r-competitive, we call A not competitive.
Proposition: LFU and LIFO are not competitive.
Proof: Let l ≥ 2 be a constant and k ≥ 2 the cache size. The different cache pages are numbered
1,...,k+1. We look at the following sequence:
σ = (1^l, 2^l, ..., (k-1)^l, (k, k+1)^(l-1))
First page 1 is requested l times, then page 2, and so on. At the end, there are (l-1) alternating
requests for pages k and k+1.
LFU and LIFO fill their cache with pages 1-k. When page k+1 is requested, page k is evicted and
vice versa. That means every request of the subsequence (k,k+1)^(l-1) evicts one page. In addition, there
are k-1 cache misses for the first-time use of pages 1-(k-1). So LFU and LIFO evict exactly k-1+2(l-1)
pages.
Now we must show that for every constant τ∈ℜ and every constant r ≥ 1 there exists an l so that
k-1+2(l-1) > r ⋅ OPT(σ) + τ.
Since the optimal offline algorithm faults only a constant number of times on σ (at most once per
distinct page), the left-hand side grows with l while the right-hand side stays fixed. To satisfy this
inequality you just have to choose l sufficiently big. So LFU and LIFO are not competitive.
Proposition 1.7: There is no r-competitive deterministic online algorithm for paging with r < k.
The proof for this last proposition is rather long and based on the statement that LFD is an optimal
offline algorithm. The interested reader can look it up in the book of Borodin and El-Yaniv (see
sources above).
The question is whether we could do better. For that, we have to leave the deterministic approach
behind us and start to randomize our algorithm. Clearly, it's much harder for the adversary to
punish your algorithm if it's randomized.
Chapter 49: Pancake Sort
Examples
Pancake Sort Basic Information
Pancake Sort is the colloquial term for the mathematical problem of sorting a disordered stack of
pancakes in order of size, when a spatula can be inserted at any point in the stack and used to flip
all pancakes above it. A pancake number is the minimum number of flips required for a given
number of pancakes.
Unlike a traditional sorting algorithm, which attempts to sort with the fewest comparisons possible,
the goal is to sort the sequence in as few reversals as possible.
The idea is to do something similar to Selection Sort: we repeatedly place the maximum element at
the end and reduce the size of the current array by one.
1. We need to order the pancakes from smallest (top) to largest (bottom); the starting stack can be
arranged in any order.
2. The only operation we can perform is a flip, which reverses the whole top portion of the stack
above the spatula.
3. To flip a specific pancake to the bottom of the stack, we first must flip it to the top (then flip it
again to the bottom).
4. To order each pancake will require one flip up to the top and one flip down to its final
location.
Intuitive Algorithm:
1. Find the largest out-of-order pancake and flip it to the bottom (you may need to flip it to the
top of the stack first).
2. Repeat step 1 for the remaining, still unsorted, stack above it.
Auxiliary Space: O(1)
Time Complexity: O(n^2)
C# Implementation
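A minimal C# sketch of the intuitive algorithm above (repeatedly flip the largest unsorted pancake to the top, then flip it down to its final place); the class and method names are illustrative:

public static class PancakeSort
{
    private static void Flip(int[] input, int k)
    {
        // reverse input[0..k], i.e. "insert the spatula" below position k
        for (int left = 0, right = k; left < right; left++, right--)
        {
            int temp = input[left];
            input[left] = input[right];
            input[right] = temp;
        }
    }

    public static void Sort(int[] input)
    {
        for (int size = input.Length; size > 1; size--)
        {
            int maxIndex = 0;                // largest pancake among input[0..size-1]
            for (int i = 1; i < size; i++)
                if (input[i] > input[maxIndex])
                    maxIndex = i;
            if (maxIndex == size - 1)
                continue;                    // already in place
            Flip(input, maxIndex);           // bring it to the top
            Flip(input, size - 1);           // flip it down to its final position
        }
    }
}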
Chapter 50: Pascal's Triangle
Examples
Pascal's Triagle Basic Information
One of the most interesting number patterns is Pascal's Triangle. The name "Pascal's Triangle" comes
from Blaise Pascal, a famous French mathematician and philosopher.
Each number in the triangle is the sum of the two numbers directly above it. For example, the initial
number in the first (or any other) row is 1 (the sum of 0 and 1), whereas the numbers 1 and 3 in the
third row are added to produce the number 4 in the fourth row. In binomial-coefficient terms,
C(n, k) = C(n-1, k-1) + C(n-1, k)
for any non-negative integer n and any integer k between 0 and n, inclusive. This recurrence for
the binomial coefficients is known as Pascal's rule. Pascal's triangle has higher dimensional
generalizations. The three-dimensional version is called Pascal's pyramid or Pascal's tetrahedron,
while the general versions are called Pascal's simplices.
Implementation of Pascal's Triangle in C#
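A minimal C# sketch that builds the triangle row by row with Pascal's rule (each row starts and ends with 1; every inner entry is the sum of the two entries above it); the names are illustrative:

using System;

public static class PascalsTriangle
{
    public static void Print(int rows)
    {
        var row = new long[] { 1 };
        for (int n = 0; n < rows; n++)
        {
            Console.WriteLine(string.Join(" ", row));
            var next = new long[row.Length + 1];
            next[0] = 1;
            next[next.Length - 1] = 1;
            for (int k = 1; k < row.Length; k++)
                next[k] = row[k - 1] + row[k];   // Pascal's rule
            row = next;
        }
    }
}

Calling PascalsTriangle.Print(5) prints the first five rows: 1, 1 1, 1 2 1, 1 3 3 1, 1 4 6 4 1.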
Number pyramid in C

#include <stdio.h>

int main()
{
    int i, space, rows, k = 0, count = 0, count1 = 0;
    printf("Enter the number of rows: ");
    scanf("%d", &rows);
    for (i = 1; i <= rows; ++i)
    {
        for (space = 1; space <= rows - i; ++space)
        {
            printf("  ");
            ++count;
        }
        while (k != 2*i-1)
        {
            if (count <= rows-1)
            {
                printf("%d ", i+k);
                ++count;
            }
            else
            {
                ++count1;
                printf("%d ", (i+k-2*count1));
            }
            ++k;
        }
        count1 = count = k = 0;
        printf("\n");
    }
    return 0;
}
Output
1
2 3 2
3 4 5 4 3
4 5 6 7 6 5 4
5 6 7 8 9 8 7 6 5
Chapter 51: Pigeonhole Sort
Examples
Pigeonhole Sort Basic Information
Pigeonhole Sort is a sorting algorithm that is suitable for sorting lists of elements where the
number of elements (n) and the number of possible key values (N) are approximately the same. It
requires O(n + Range) time, where n is the number of elements in the input array and 'Range' is the
number of possible values in the array.
1. Find the minimum and maximum values in the array. Let the minimum and maximum values be 'min'
and 'max' respectively. Also find the range as 'max - min + 1'.
2. Set up an array of initially empty "pigeonholes", the same size as the range.
3. Visit each element of the array and put each element in its pigeonhole. An element
input[i] is put in the hole at index input[i] - min.
4. Iterate over the pigeonhole array in order and put the elements from non-empty
holes back into the original array.
Pigeonhole sort is similar to counting sort; the main difference is that counting sort only keeps a
count for each key value, while pigeonhole sort moves the elements themselves into their holes.
Space Auxiliary: O(n)
Time Complexity: O(n + N)
C# Implementation
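A minimal C# sketch following the four steps above; the names are illustrative:

using System.Collections.Generic;

public static class PigeonholeSort
{
    public static void Sort(int[] input)
    {
        if (input.Length == 0)
            return;
        int min = input[0], max = input[0];
        foreach (int value in input)             // step 1: find min and max
        {
            if (value < min) min = value;
            if (value > max) max = value;
        }
        int range = max - min + 1;
        var holes = new List<int>[range];        // step 2: empty pigeonholes
        for (int i = 0; i < range; i++)
            holes[i] = new List<int>();
        foreach (int value in input)             // step 3: value goes into hole value - min
            holes[value - min].Add(value);
        int index = 0;                           // step 4: read the holes back in order
        foreach (var hole in holes)
            foreach (int value in hole)
                input[index++] = value;
    }
}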
Chapter 52: polynomial-time bounded
algorithm for Minimum Vertex Cover
Introduction
This is a polynomial-time algorithm for getting an approximate minimum vertex cover of a connected
undirected graph (finding a true minimum vertex cover is NP-hard in general). The time complexity
of this algorithm is O(n²).
Parameters
Variable    Meaning
X           The set of all vertices, sorted descendingly by degree
C           The final vertex set forming the cover
Remarks
The first thing you have to do in this algorithm is to get all of the vertices of the graph sorted in
descending order according to their degrees.
After that, you iterate over them and add each vertex to the final vertex set if it doesn't have any
adjacent vertex in that set.
In the final stage, iterate over the final vertex set and remove all of the vertices which have one of
their adjacent vertices in this set.
Examples
Algorithm Pseudo Code
Set C <- new Set<Vertex>()
Set X <- new Set<Vertex>()
X <- G.getAllVerticiesArrangedDescendinglyByDegree()

for v in X do
    List<Vertex> adjacentVertices <- G.getAdjacent(v)
    if C contains none of adjacentVertices then
        C.add(v)

for vertex in C do
    List<Vertex> adjacentVertices <- G.getAdjacent(vertex)
    if C contains any of adjacentVertices then
        C.remove(vertex)

return C
we can use bucket sort for sorting the vertices according to their degrees, because the
maximum value of the degrees is (n-1), where n is the number of vertices; then the time
complexity of the sorting will be O(n)
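A direct C# transliteration of the pseudo code, under the assumption that the graph is given as an adjacency list (a dictionary from each vertex to its neighbours); the bucket-sort refinement from the note is left out for brevity:

using System.Collections.Generic;
using System.Linq;

public static class VertexCover
{
    public static HashSet<int> Find(Dictionary<int, List<int>> graph)
    {
        var c = new HashSet<int>();
        // all vertices, arranged descendingly by degree
        var x = graph.Keys.OrderByDescending(v => graph[v].Count).ToList();

        foreach (int v in x)
            if (!graph[v].Any(u => c.Contains(u)))
                c.Add(v);                        // no chosen neighbour yet, so take v

        foreach (int v in c.ToList())
            if (graph[v].Any(u => c.Contains(u)))
                c.Remove(v);                     // drop v if a neighbour is also chosen

        return c;
    }
}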
Chapter 53: Prim's Algorithm
Examples
Introduction To Prim's Algorithm
Let's say we have 8 houses. We want to setup telephone lines between these houses. The edge
between the houses represent the cost of setting line between two houses.
Our task is to set up lines in such a way that all the houses are connected and the cost of setting
up the whole connection is minimum. Now how do we find that out? We can use Prim's
Algorithm.
Prim's Algorithm is a greedy algorithm that finds a minimum spanning tree for a weighted
undirected graph. This means it finds a subset of the edges that forms a tree that includes every
node, where the total weight of all the edges in the tree is minimized. The algorithm was
developed in 1930 by Czech mathematician Vojtěch Jarník and later rediscovered and republished
by computer scientist Robert Clay Prim in 1957 and Edsger Wybe Dijkstra in 1959. It is also
known as the DJP algorithm, Jarnik's algorithm, the Prim-Jarnik algorithm or the Prim-Dijkstra
algorithm.
Now let's look at the technical terms first. If we create a graph, S using some nodes and edges of
an undirected graph G, then S is called a subgraph of the graph G. Now S will be called a
Spanning Tree if and only if:
• It contains all the nodes of G.
• It is a tree, that means there is no cycle and all the nodes are connected.
• There are (n-1) edges in the tree, where n is the number of nodes in G.
There can be many Spanning Tree's of a graph. The Minimum Spanning Tree of a weighted
undirected graph is a tree, such that sum of the weight of the edges is minimum. Now we'll use
Prim's algorithm to find out the minimum spanning tree, that is how to set up the telephone lines
in our example graph in such way that the cost of set up is minimum.
At first we'll select a source node. Let's say, node-1 is our source. Now we'll add the edge from
node-1 that has the minimum cost to our subgraph. Here we mark the edges that are in the
subgraph using the color blue. Here 1-5 is our desired edge.
Now we consider all the edges from node-1 and node-5 and take the minimum. Since 1-5 is
already marked, we take 1-2.
This time, we consider node-1, node-2 and node-5 and take the minimum edge which is 5-4.
The next step is important. From node-1, node-2, node-5 and node-4, the minimum edge is 2-4.
But if we select that one, it'll create a cycle in our subgraph. This is because node-2 and node-4
are already in our subgraph. So taking edge 2-4 doesn't benefit us. We'll select the edges in such
way that it adds a new node in our subgraph. So we select edge 4-8.
If we continue this way, we'll select edge 8-6, 6-7 and 4-3. Our subgraph will look like:
This is our desired subgraph, that'll give us the minimum spanning tree. If we remove the edges
that we didn't select, we'll get:
This is our minimum spanning tree (MST). So the cost of setting up the telephone connections
is: 4 + 2 + 5 + 11 + 9 + 2 + 1 = 34. And the set of houses and their connections are shown in the
graph. There can be multiple MST of a graph. It depends on the source node we choose.
Complexity:
Time complexity of the above naive approach is O(V²). It uses adjacency matrix. We can reduce
the complexity using priority queue. When we add a new node to Vnew, we can add its adjacent
edges in the priority queue. Then pop the minimum weighted edge from it. Then the complexity will
be: O(ElogE), where E is the number of edges. Again a Binary Heap can be constructed to reduce
the complexity to O(ElogV).
Procedure PrimsMST(Graph, source):
for each u in V
    key[u] := inf
    parent[u] := NULL
end for
key[source] := 0
Q = Priority_Queue()
Q = V
while Q is not empty
    u -> Q.pop
    for each v adjacent to u
        if v belongs to Q and Edge(u,v) < key[v]    // here Edge(u, v) represents
                                                    // cost of edge(u, v)
            parent[v] := u
            key[v] := Edge(u, v)
        end if
    end for
end while
Here key[] stores the minimum cost of traversing node-v. parent[] is used to store the parent
node. It is useful for traversing and printing the tree.
import java.util.*;

public class Graph
{
    int[][] LinkCost;
    int NNodes;

    Graph(int[][] mat)
    {
        int i, j;
        NNodes = mat.length;
        LinkCost = new int[NNodes][NNodes];
        for ( i = 0; i < NNodes; i++ )
            for ( j = 0; j < NNodes; j++ )
            {
                LinkCost[i][j] = mat[i][j];
                if ( LinkCost[i][j] == 0 )
                    LinkCost[i][j] = Integer.MAX_VALUE; // no direct link
            }
        for ( i = 0; i < NNodes; i++ )
        {
            for ( j = 0; j < NNodes; j++ )
                if ( LinkCost[i][j] < Integer.MAX_VALUE )
                    System.out.print( " " + LinkCost[i][j] + " " );
                else
                    System.out.print(" * ");
            System.out.println();
        }
    }

    public int unReached(boolean[] r)
    {
        for ( int i = 0; i < r.length; i++ )
            if ( !r[i] )
                return i;
        return -1;
    }
public void Prim( )
{
int i, j, k, x, y;
boolean[] Reached = new boolean[NNodes];
int[] predNode = new int[NNodes];
Reached[0] = true;
for ( k = 1; k < NNodes; k++ )
{
Reached[k] = false;
}
predNode[0] = 0;
printReachSet( Reached );
for (k = 1; k < NNodes; k++)
{
x = y = 0;
for ( i = 0; i < NNodes; i++ )
for ( j = 0; j < NNodes; j++ )
{
if ( Reached[i] && !Reached[j] &&
LinkCost[i][j] < LinkCost[x][y] )
{
x = i;
y = j;
}
}
System.out.println("Min cost edge: (" +
+ x + "," +
+ y + ")" +
"cost = " + LinkCost[x][y]);
predNode[y] = x;
Reached[y] = true;
printReachSet( Reached );
System.out.println();
}
int[] a= predNode;
for ( i = 0; i < NNodes; i++ )
System.out.println( a[i] + " --> " + i );
}
void printReachSet(boolean[] Reached )
{
System.out.print("ReachSet = ");
for (int i = 0; i < Reached.length; i++ )
if ( Reached[i] )
System.out.print( i + " ");
//System.out.println();
}
public static void main(String[] args)
{
int[][] conn = {{0,3,0,2,0,0,0,0,4}, // 0
{3,0,0,0,0,0,0,4,0}, // 1
{0,0,0,6,0,1,0,2,0}, // 2
{2,0,6,0,1,0,0,0,0}, // 3
{0,0,0,1,0,0,0,0,8}, // 4
{0,0,1,0,0,0,8,0,0}, // 5
{0,0,0,0,0,8,0,0,0}, // 6
{0,4,2,0,0,0,0,0,0}, // 7
{4,0,0,0,8,0,0,0,0} // 8
};
Graph G = new Graph(conn);
G.Prim();
}
}
Output:
$ java Graph
* 3 * 2 * * * * 4
3 * * * * * * 4 *
* * * 6 * 1 * 2 *
2 * 6 * 1 * * * *
* * * 1 * * * * 8
* * 1 * * * 8 * *
* * * * * 8 * * *
* 4 2 * * * * * *
4 * * * 8 * * * *
ReachSet = 0 Min cost edge: (0,3)cost = 2
ReachSet = 0 3
Min cost edge: (3,4)cost = 1
ReachSet = 0 3 4
Min cost edge: (0,1)cost = 3
ReachSet = 0 1 3 4
Min cost edge: (0,8)cost = 4
ReachSet = 0 1 3 4 8
Min cost edge: (1,7)cost = 4
ReachSet = 0 1 3 4 7 8
Min cost edge: (7,2)cost = 2
ReachSet = 0 1 2 3 4 7 8
Min cost edge: (2,5)cost = 1
ReachSet = 0 1 2 3 4 5 7 8
Min cost edge: (5,6)cost = 8
ReachSet = 0 1 2 3 4 5 6 7 8
0 --> 0
0 --> 1
7 --> 2
0 --> 3
3 --> 4
2 --> 5
5 --> 6
1 --> 7
0 --> 8
Chapter 54: Pseudocode
Remarks
Pseudocode is by definition informal. This topic is meant to describe ways to translate language-
specific code into something everyone with a programming background can understand.
Pseudocode is an important way to describe an algorithm and is more neutral than giving a
language-specific implementation. Wikipedia often uses some form of pseudocode when
describing an algorithm.
Some things, like if-else type conditions, are quite easy to write down informally. But other things,
js-style callbacks for instance, may be hard to turn into pseudocode for some people.
Examples
Variable affectations
Typed
int a = 1
int a := 1
let int a = 1
int a <- 1
No type
a = 1
a := 1
let a = 1
a <- 1
Functions
As long as the function name, return statement and parameters are clear, you're fine.
def incr n
return n + 1
or
let incr(n) = n + 1
Both are quite clear, so you may use them. Try not to be ambiguous with a variable affectation.
Chapter 55: Quicksort
Remarks
Sometimes Quicksort is also known as Partition-Exchange sort.
Auxiliary Space: O(n)
Time complexity: worst O(n²), best O(n log n)
Examples
Quicksort Basics
Quicksort is a sorting algorithm that picks an element ("the pivot") and reorders the array forming
two partitions such that all elements less than the pivot come before it and all elements greater
come after. The algorithm is then applied recursively to the partitions until the list is sorted.
1. Lomuto partition scheme:
This scheme chooses a pivot which is typically the last element in the array. The algorithm
maintains the index to put the pivot in variable i and each time it finds an element less than or
equal to pivot, this index is incremented and that element would be placed before the pivot.
2. Hoare partition scheme:
It uses two indices that start at the ends of the array being partitioned, then move toward each
other, until they detect an inversion: a pair of elements, one greater or equal than the pivot, one
lesser or equal, that are in the wrong order relative to each other. The inverted elements are then
swapped. When the indices meet, the algorithm stops and returns the final index. Hoare's scheme
is more efficient than Lomuto's partition scheme because it does three times fewer swaps on
average, and it creates efficient partitions even when all values are equal.
Partition:
algorithm partition(A, lo, hi) is
    pivot := A[lo]
    i := lo - 1
    j := hi + 1
    loop forever
        do:
            i := i + 1
        while A[i] < pivot
        do:
            j := j - 1
        while A[j] > pivot
        if i >= j then
            return j
        swap A[i] with A[j]
C# Implementation
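A minimal C# sketch using the Hoare partition scheme described above (note that with Hoare partitioning the recursion goes to (lo, p) and (p + 1, hi), not (lo, p - 1)):

public static class Quicksort
{
    public static void Sort(int[] a, int lo, int hi)
    {
        if (lo >= hi)
            return;
        int p = Partition(a, lo, hi);
        Sort(a, lo, p);
        Sort(a, p + 1, hi);
    }

    private static int Partition(int[] a, int lo, int hi)
    {
        int pivot = a[lo];
        int i = lo - 1, j = hi + 1;
        while (true)
        {
            do { i++; } while (a[i] < pivot);
            do { j--; } while (a[j] > pivot);
            if (i >= j)
                return j;
            int t = a[i]; a[i] = a[j]; a[j] = t;   // swap the inverted pair
        }
    }
}

A call such as Quicksort.Sort(arr, 0, arr.Length - 1) sorts the whole array.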
Haskell Implementation
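A concise sketch in the classic list-comprehension style (this copies lists rather than partitioning in place, trading the usual space behaviour for clarity):

quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (x:xs) =
    quicksort [a | a <- xs, a < x]     -- elements smaller than the pivot
    ++ [x]                             -- the pivot itself
    ++ quicksort [a | a <- xs, a >= x] -- elements greater than or equal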
Quicksort in Python

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3, 6, 8, 10, 1, 2, 1]))
Chapter 56: Radix Sort
Examples
Radix Sort Basic Information
Radix sort is a non-comparative integer sorting algorithm, so the comparison-based lower bound
of Ω(n log n) does not apply to it. It sorts data with integer keys by grouping the keys by
individual digits that share the same significant position and value. Radix sort uses counting
sort as a subroutine, which sorts n elements in O(n+k) time when the keys are in the range 1 to k,
so for keys of fixed width the whole sort runs in linear time. The idea of radix sort is to sort
digit by digit, starting from the least significant digit and moving toward the most significant
digit. Radix sort is a generalization of bucket sort.
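As a sketch, an LSD (least significant digit) radix sort for non-negative integers in C#, using a stable counting sort on each decimal digit; all names here are illustrative:

using System;

class RadixSort
{
    // Stable counting sort of arr by the digit at 'exp' (1 = ones, 10 = tens, ...)
    static void CountingSortByDigit(int[] arr, int exp)
    {
        int n = arr.Length;
        var output = new int[n];
        var count = new int[10];

        for (int i = 0; i < n; i++)
            count[(arr[i] / exp) % 10]++;
        for (int d = 1; d < 10; d++)
            count[d] += count[d - 1];          // prefix sums give final positions
        for (int i = n - 1; i >= 0; i--)       // walk backwards to keep the sort stable
            output[--count[(arr[i] / exp) % 10]] = arr[i];
        Array.Copy(output, arr, n);
    }

    static void Sort(int[] arr)
    {
        int max = 0;
        foreach (var x in arr) max = Math.Max(max, x);
        for (int exp = 1; max / exp > 0; exp *= 10)   // least significant digit first
            CountingSortByDigit(arr, exp);
    }

    static void Main()
    {
        var a = new[] { 170, 45, 75, 90, 802, 24, 2, 66 };
        Sort(a);
        Console.WriteLine(string.Join(" ", a));   // 2 24 45 66 75 90 170 802
    }
}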
Chapter 57: Searching
Examples
Binary Search
Introduction
Binary Search is a Divide and Conquer search algorithm. It uses O(log n) time to find the location
of an element in a search space where n is the size of the search space.
Binary Search works by halving the search space at each iteration after comparing the target
value to the middle value of the search space.
To use Binary Search, the search space must be ordered (sorted) in some way. Duplicate entries
(ones that compare as equal according to the comparison function) cannot be distinguished,
though they don't violate the Binary Search property.
Conventionally, we use less than (<) as the comparison function. If a < b, it will return true. If a is
not less than b and b is not less than a, then a and b are equal.
Example Question
You are an economist, a pretty bad one though. You are given the task of finding the equilibrium
price (that is, the price where supply = demand) for rice.
Remember: the higher a price is set, the larger the supply and the smaller the demand.
As your company is very efficient at calculating market forces, you can instantly get the supply and
demand in units of rice when the price of rice is set to a certain value p.
Your boss wants the equilibrium price ASAP, but tells you that the equilibrium price can be a
positive integer that is at most 10^17 and there is guaranteed to be exactly 1 positive integer
solution in the range. So get going with your job before you lose it!
You are allowed to call functions getSupply(k) and getDemand(k), which will do exactly what is
stated in the problem.
Example Explanation
Here our search space is from 1 to 10^17. Thus a linear search is infeasible.
However, notice that as k goes up, getSupply(k) increases and getDemand(k) decreases. Thus,
for any x > y, getSupply(x) - getDemand(x) > getSupply(y) - getDemand(y). Therefore, this search
space is monotonic and we can use Binary Search.
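A C# sketch of this search on the answer; getSupply and getDemand are given by the problem, so the stub market below exists only to make the example run:

using System;

class EquilibriumPrice
{
    // Stubs standing in for the problem's functions: supply grows and
    // demand shrinks as the price rises.
    static long GetSupply(long price) => 3 * price;
    static long GetDemand(long price) => 5_000_000_000_000_000L - 2 * price;

    static long FindEquilibriumPrice()
    {
        long low = 1, high = 100_000_000_000_000_000L;   // prices in [1, 10^17]
        while (low < high)
        {
            long mid = low + (high - low) / 2;
            if (GetSupply(mid) >= GetDemand(mid))
                high = mid;        // equilibrium is at mid or below it
            else
                low = mid + 1;     // supply still below demand: look at higher prices
        }
        return low;                // low == high == the equilibrium price
    }

    static void Main()
    {
        Console.WriteLine(FindEquilibriumPrice());   // 1000000000000000 for the stubs above
    }
}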
This algorithm runs in ~O(log 10^17) time. This can be generalized to ~O(log S) time where S is the
size of the search space since at every iteration of the while loop, we halved the search space (
from [low:high] to either [low:mid] or [mid:high]).
int binsearch(int a[], int x, int low, int high) {
    if (low > high) {
        return -1;   // not found
    }
    int mid = low + (high - low) / 2;
    if (x == a[mid]) {
        return mid;
    } else if (x < a[mid]) {
        return binsearch(a, x, low, mid - 1);
    } else {
        return binsearch(a, x, mid + 1, high);
    }
}
low = 0;
high = N - 1;
while (low < high)
{
    mid = (low + high) / 2;
    if (array[mid] < x)
        low = mid + 1;
    else
        high = mid;
}
if(array[low] == x)
// found, index is low
else
// not found
Do not attempt to return early by comparing array[mid] to x for equality; the extra comparison can
only slow the code down. Note that the + 1 in low = mid + 1 is needed: since integer division
always rounds down, the loop could otherwise get stuck once low and high become adjacent.
Interestingly, the above version of binary search allows you to find the smallest occurrence of x in
the array. If the array contains duplicates of x, the algorithm can be modified slightly in order for it
to return the largest occurrence of x instead:
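One such modification (a sketch; array, N and x are as in the snippet above) keeps mid in range on equality and rounds the midpoint up so the loop still terminates:

low = 0;
high = N - 1;
while (low < high)
{
    mid = low + (high - low + 1) / 2;   // round up, otherwise the loop can get stuck
    if (array[mid] <= x)
        low = mid;        // mid may still be the last occurrence, keep it
    else
        high = mid - 1;
}
if (array[low] == x)
    // found, index is low (the largest occurrence)
else
    // not found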
Note that instead of doing mid = (low + high) / 2, it may also be a good idea to try mid = low +
((high - low) / 2) in languages such as Java, to lower the risk of integer overflow for really
large inputs.
Linear search
Linear search is a simple algorithm. It loops through items until the query has been found, which
makes it a linear algorithm - the complexity is O(n), where n is the number of items to go through.
Why O(n)? In the worst-case scenario, you have to go through all of the n items.
It can be compared to looking for a book in a stack of books - you go through them all until you
find the one that you want.
Rabin Karp
The Rabin–Karp algorithm or Karp–Rabin algorithm is a string searching algorithm that uses
hashing to find any one of a set of pattern strings in a text. Its average and best case running time
is O(n+m) in space O(p), but its worst-case time is O(nm), where n is the length of the text and m is
the length of the pattern.
While calculating the hash value we involve a prime number in order to keep collisions rare. Even
so, there is still a chance that the hash value can be the same for two different strings, so when we
get a match we have to check it character by character to make sure that we got a proper match.

The hash value is recalculated for each new window of the text: first the contribution of the
leftmost character is removed, and then the new character from the text is added.
1. Worst Case
2. Average Case
3. Best Case
#include <stdio.h>

// Search x in arr[]; if x is present, return its index,
// otherwise return -1
int search(int arr[], int n, int x)
{
int i;
for (i=0; i<n; i++)
{
if (arr[i] == x)
return i;
}
return -1;
}
int main()
{
int arr[] = {1, 10, 30, 15};
int x = 30;
int n = sizeof(arr)/sizeof(arr[0]);
printf("%d is present at index %d", x, search(arr, n, x));
getchar();
return 0;
}
In the worst case analysis, we calculate an upper bound on the running time of an algorithm. We
must know the case that causes the maximum number of operations to be executed. For linear search,
the worst case happens when the element to be searched (x in the above code) is not present in the
array. When x is not present, the search() function compares it with all the elements of arr[] one
by one. Therefore, the worst case time complexity of linear search is Θ(n).
In average case analysis, we take all possible inputs, calculate the computing time for each of
them, sum all the calculated values, and divide the sum by the total number of inputs. We must know
(or predict) the distribution of cases. For the linear search problem, let us assume that all cases
are uniformly distributed (including the case of x not being present in the array). So we sum all
the cases and divide the sum by (n+1): the average number of comparisons is
(1 + 2 + ... + n + n) / (n+1), which is Θ(n), so the average case time complexity is Θ(n).
In the best case analysis, we calculate a lower bound on the running time of an algorithm. We must
know the case that causes the minimum number of operations to be executed. In the linear search
problem, the best case occurs when x is present at the first location. The number of operations in
the best case is constant (not dependent on n), so the time complexity in the best case is Θ(1).
Most of the time, we do worst case analysis to analyze algorithms: in the worst case analysis, we
guarantee an upper bound on the running time of an algorithm, which is useful information. The
average case analysis is not easy to do in most practical cases and it is rarely done, since in the
average case analysis we must know (or predict) the mathematical distribution of all possible
inputs. The best case analysis is bogus: guaranteeing a lower bound on an algorithm doesn't
provide any real information, as in the worst case an algorithm may still take years to run.
For some algorithms, all the cases are asymptotically the same, i.e., there are no worst and best
cases. Merge sort, for example, does Θ(n log n) operations in all cases. Most of the other sorting
algorithms have distinct worst and best cases. For example, in the typical implementation of
quicksort (where the pivot is chosen as a corner element), the worst case occurs when the input
array is already sorted, and the best case occurs when the pivot always divides the array into two
halves. For insertion sort, the worst case occurs when the array is reverse sorted, and the best
case occurs when the array is already sorted in the same order as the output.
Chapter 58: Selection Sort
Examples
Selection Sort Basic Information
Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O(n²) time
complexity, making it inefficient on large lists, and it generally performs worse than the similar
insertion sort. Selection sort is noted for its simplicity, and it has performance advantages over
more complicated algorithms in certain situations, particularly where auxiliary memory is limited.
The algorithm divides the input list into two parts: the sublist of items already sorted, which is built
up from left to right at the front (left) of the list, and the sublist of items remaining to be sorted that
occupy the rest of the list. Initially, the sorted sublist is empty and the unsorted sublist is the entire
input list. The algorithm proceeds by finding the smallest (or largest, depending on sorting order)
element in the unsorted sublist, exchanging (swapping) it with the leftmost unsorted element
(putting it in sorted order), and moving the sublist boundaries one element to the right.
function select(list[1..n], k)
for i from 1 to k
minIndex = i
minValue = list[i]
for j from i+1 to n
if list[j] < minValue
minIndex = j
minValue = list[j]
swap list[i] and list[minIndex]
return list[k]
Example of Selection sort:
public static void SelectionSort(int[] input)
{
    for (var i = 0; i < input.Length - 1; i++)
    {
        var minId = i;
        for (var j = i + 1; j < input.Length; j++)
        {
            if (input[j] < input[minId]) minId = j;
        }
        var temp = input[minId];
        input[minId] = input[i];
        input[i] = temp;
    }
}
Elixir Implementation
defmodule Selection do
  # Repeatedly take the minimum element and prepend it to the sorted rest
  def sort([]), do: []
  def sort(list) do
    m = min(list)
    [m | sort(List.delete(list, m))]
  end

  defp smaller(a, b) when a <= b, do: a
  defp smaller(_a, b), do: b

  defp min([only]), do: only
  defp min([first | [second | []]]), do: smaller(first, second)
  defp min([first | [second | tail]]), do: min([smaller(first, second) | tail])
end

Selection.sort([100, 4, 10, 6, 9, 3])
|> IO.inspect
Chapter 59: Shell Sort
Examples
Shell Sort Basic Information
Shell sort, also known as the diminishing increment sort, is one of the oldest sorting algorithms,
named after its inventor Donald L. Shell (1959). It is fast, easy to understand and easy to
implement. However, its complexity analysis is a little more sophisticated.
Shell sort improves insertion sort. It starts by comparing elements far apart, then elements less far
apart, and finally comparing adjacent elements (effectively an insertion sort).
The effect is that the data sequence is partially sorted. The process above is repeated, but each
time with a narrower array, i.e. with a smaller number of columns. In the last step, the array
consists of only one column.
Pseudo code for Shell Sort:

input: array a of length n
gaps: a sequence of gap sizes in decreasing order, ending with 1

foreach gap in gaps
{
    for (i = gap; i < n; i++)
    {
        temp = a[i]
        for (j = i; j >= gap and a[j - gap] > temp; j -= gap)
        {
            a[j] = a[j - gap]
        }
        a[j] = temp
    }
}
Auxiliary Space: O(n) total, O(1) auxiliary
Time Complexity: depends on the gap sequence; O(n²) in the worst case for the simple n/2, n/4, ..., 1 sequence, and as low as O(n log² n) for better-studied sequences
C# Implementation
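A minimal sketch using the simple gap sequence n/2, n/4, ..., 1 (other gap sequences exist and generally perform better):

using System;

class ShellSort
{
    static void Sort(int[] a)
    {
        int n = a.Length;
        // Start with a large gap and shrink it; the final pass (gap = 1)
        // is an ordinary insertion sort over nearly-sorted data.
        for (int gap = n / 2; gap > 0; gap /= 2)
        {
            for (int i = gap; i < n; i++)
            {
                int temp = a[i];
                int j = i;
                while (j >= gap && a[j - gap] > temp)
                {
                    a[j] = a[j - gap];   // shift larger elements one gap to the right
                    j -= gap;
                }
                a[j] = temp;
            }
        }
    }

    static void Main()
    {
        var a = new[] { 35, 33, 42, 10, 14, 19, 27, 44 };
        Sort(a);
        Console.WriteLine(string.Join(" ", a));   // 10 14 19 27 33 35 42 44
    }
}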
Chapter 60: Shortest Common Supersequence Problem
Examples
Shortest Common Supersequence Problem Basic Information
The shortest common supersequence problem is closely related to the longest common
subsequence problem, which you can use as an external function for this task.
For two input sequences, an scs can be formed from a longest common subsequence (lcs) easily.
For example, if X[1..m]=abcbdab and Y[1..n]=bdcaba, the lcs is Z[1..r]=bcba. By inserting the non-lcs
symbols while preserving the symbol order, we get the scs: U[1..t]=abdcabdab.
It is quite clear that r+t=m+n for two input sequences. However, for three or more input sequences
this does not hold. Note also, that the lcs and the scs problems are not dual problems.
The more general problem of finding a string S which is a superstring of a set of strings
S1, S2, ..., Sl is NP-Complete. Also, good approximations can be found for the
average case but not for the worst case.
Time Complexity: O(m·n) for filling the DP table; reconstructing the supersequence from the table takes a further O(m+n)
public static void Main()
{
    Console.WriteLine(Scs("abcbdab", "bdcaba"));   // prints 9
}
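The Scs method called above is a sketch of the standard DP recurrence, computing only the length of the shortest common supersequence:

// dp[i, j]: length of the shortest common supersequence of
// the first i characters of x and the first j characters of y.
public static int Scs(string x, string y)
{
    int m = x.Length, n = y.Length;
    var dp = new int[m + 1, n + 1];
    for (int i = 0; i <= m; i++)
    {
        for (int j = 0; j <= n; j++)
        {
            if (i == 0)
                dp[i, j] = j;                    // only y remains
            else if (j == 0)
                dp[i, j] = i;                    // only x remains
            else if (x[i - 1] == y[j - 1])
                dp[i, j] = dp[i - 1, j - 1] + 1; // shared symbol, counted once
            else
                dp[i, j] = 1 + Math.Min(dp[i - 1, j], dp[i, j - 1]);
        }
    }
    return dp[m, n];
}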
Chapter 61: Sliding Window Algorithm
Examples
Sliding Window Algorithm Basic Information
The sliding window algorithm is used to perform a required operation over a window of specific
size on a given large buffer or array. The window starts from the 1st element and keeps shifting
right by one element. The objective is to find the minimum k numbers present in each window. This
is commonly known as the sliding window problem or algorithm.

For example, to find the maximum or minimum element of every n consecutive elements in a given
array, the sliding window algorithm is used.
Example:
Input Array: [1 3 -1 -3 5 3 6 7]
Window Size: 3
+---------------------------------+---------+
| Windows Position | Max |
+------------+----+---+---+---+---+---------+
|[1 3 -1]| -3 | 5 | 3 | 6 | 7 | 3 |
+------------+----+---+---+---+---+---------+
| 1 |[3 -1 -3]| 5 | 3 | 6 | 7 | 3 |
+---+-------------+---+---+---+---+---------+
| 1 | 3 |[-1 -3 5]| 3 | 6 | 7 | 5 |
+---+---+-------------+---+---+---+---------+
| 1 | 3 | -1 |[-3 5 3]| 6 | 7 | 5 |
+---+---+----+------------+---+---+---------+
| 1 | 3 | -1 | -3 |[5 3 6]| 7 | 6 |
+---+---+----+----+-----------+---+---------+
| 1 | 3 | -1 | -3 | 5 |[3 6 7]| 7 |
+---+---+----+----+---+-----------+---------+
+---------------------------------+---------+
| Windows Position | Min |
+------------+----+---+---+---+---+---------+
|[1 3 -1]| -3 | 5 | 3 | 6 | 7 | -1 |
+------------+----+---+---+---+---+---------+
| 1 |[3 -1 -3]| 5 | 3 | 6 | 7 | -3 |
+---+-------------+---+---+---+---+---------+
| 1 | 3 |[-1 -3 5]| 3 | 6 | 7 | -3 |
+---+---+-------------+---+---+---+---------+
| 1 | 3 | -1 |[-3 5 3]| 6 | 7 | -3 |
+---+---+----+------------+---+---+---------+
| 1 | 3 | -1 | -3 |[5 3 6]| 7 | 3 |
+---+---+----+----+-----------+---+---------+
| 1 | 3 | -1 | -3 | 5 |[3 6 7]| 3 |
+---+---+----+----+---+-----------+---------+
Method 1:
The first way is to use quicksort's partitioning: when the pivot lands at the Kth position, all
elements on the right side are greater than the pivot, and hence all elements on the left side
automatically become the K smallest elements of the given array.
Method 2:
Keep an array of K elements, Fill it with first K elements of given input array. Now from K+1
element, check if the current element is less than the maximum element in the auxiliary array, if
yes, add this element into array. Only problem with above solution is that we need to keep track of
maximum element. Still workable. How can we keep track of maximum element in set of integer?
Think heap. Think Max heap.
Method 3:
Great! In O(1) we would get the max element among K elements already chose as smallest K
elements . If max in current set is greater than newly considered element, we need to remove max
and introduce new element in set of K smallest element. Heapify again to maintain the heap
property. Now we can easily get K minimum elements in array of N.
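A sketch of methods 2 and 3 in C#, using .NET 6's PriorityQueue as the max heap (priorities are negated, since PriorityQueue is a min-heap); the names are illustrative:

using System;
using System.Collections.Generic;

class KSmallest
{
    // Returns the k smallest elements of input, maintaining a max-heap of size k.
    static int[] FindKSmallest(int[] input, int k)
    {
        var maxHeap = new PriorityQueue<int, int>();
        foreach (var x in input)
        {
            if (maxHeap.Count < k)
                maxHeap.Enqueue(x, -x);     // fill the heap with the first k elements
            else if (x < maxHeap.Peek())
            {
                maxHeap.Dequeue();          // remove the current maximum
                maxHeap.Enqueue(x, -x);     // introduce the new, smaller element
            }
        }
        var result = new int[maxHeap.Count];
        for (int i = 0; i < result.Length; i++)
            result[i] = maxHeap.Dequeue();
        return result;
    }

    static void Main()
    {
        var a = new[] { 1, 3, -1, -3, 5, 3, 6, 7 };
        Console.WriteLine(string.Join(" ", FindKSmallest(a, 3)));   // 1 -1 -3: the three smallest, largest first
    }
}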
// Brute-force sliding-window minimum: result[i] = min of input[i .. i+k-1]
public static int[] MinSlidingWindow(int[] input, int k)
{
    var result = new int[input.Length - k + 1];
    for (int i = 0; i <= input.Length - k; i++)
    {
        var min = input[i];
        for (int j = 1; j < k; j++)
        {
            if (input[i + j] < min) min = input[i + j];
        }
        result[i] = min;
    }
    return result;
}
Chapter 62: Sorting
Parameters
Parameter                  Description

Best case complexity       A sorting algorithm has a best case time complexity of O(T(n)) if its
                           running time is at least T(n) for all possible inputs.

Average case complexity    A sorting algorithm has an average case time complexity of O(T(n)) if its
                           running time, averaged over all possible inputs, is T(n).

Worst case complexity      A sorting algorithm has a worst case time complexity of O(T(n)) if its
                           running time is at most T(n).
Examples
Stability in Sorting
Stability in sorting means whether a sorting algorithm maintains the relative order of equal keys
from the original input in the result output.
So a sorting algorithm is said to be stable if two objects with equal keys appear in the same order
in sorted output as they appear in the input unsorted array.
Now we will sort a list of such pairs using the first element of each pair as the key. If (9, 3)
appears before (9, 7) in the input, a stable sort keeps them in that order in the output:

(1, 2) (3, 4) (8, 6) (9, 3) (9, 7)
An unstable sort may generate the same output as a stable sort, but not always.

Some stable sorting algorithms:

• Merge sort
• Insertion sort
• Radix sort
• Tim sort
• Bubble Sort

Some unstable sorting algorithms:

• Heap sort
• Quick sort
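As a concrete C# illustration: Enumerable.OrderBy is documented to be stable, so equal keys keep their input order (the input pairs here are our own, not the ones above):

using System;
using System.Linq;

class StabilityDemo
{
    static void Main()
    {
        var pairs = new[] { (9, 7), (1, 2), (9, 3), (3, 4), (8, 6) };

        // OrderBy is a stable sort: (9, 7) stays ahead of (9, 3)
        // because that is their relative order in the input.
        var sorted = pairs.OrderBy(p => p.Item1).ToArray();
        Console.WriteLine(string.Join(" ", sorted));
        // (1, 2) (3, 4) (8, 6) (9, 7) (9, 3)
    }
}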
Chapter 63: Substring Search
Examples
KMP Algorithm in C
Given a text txt and a pattern pat, the objective of this program will be to print all the
occurrences of pat in txt.
Examples:

Input:
txt[] = "AABAACAADAABAAABAA"
pat[] = "AABA"

output:
Found pattern at index 0
Found pattern at index 9
Found pattern at index 13
C Language Implementation:
#include <stdio.h>
#include <string.h>

// Prints all occurrences of pat[] in txt[].
// computeLPSArray fills lps[] with prefix-suffix lengths for the pattern,
// exactly like the GenerateSuffixArray procedure described later in this chapter.
void KMPSearch(char* pat, char* txt)
{
    int M = strlen(pat);
    int N = strlen(txt);
    int lps[M];

    // Preprocess the pattern (calculate lps[] array)
    computeLPSArray(pat, M, lps);

    int i = 0, j = 0;   // indices into txt[] and pat[]
    while (i < N)
    {
        if (pat[j] == txt[i]) { i++; j++; }

        if (j == M)
        {
            printf("Found pattern at index %d \n", i-j);
            j = lps[j-1];
        }
        else if (i < N && pat[j] != txt[i])
        {
            if (j != 0)
                j = lps[j-1];   // fall back in the pattern, not in the text
            else
                i++;
        }
    }
}
Output:

Found pattern at index 0
Found pattern at index 9
Found pattern at index 13
Reference:
http://www.geeksforgeeks.org/searching-for-patterns-set-2-kmp-algorithm/
Introduction To Rabin-Karp Algorithm

Rabin-Karp Algorithm is a string searching algorithm created by Richard M. Karp and Michael O.
Rabin that uses hashing to find any one of a set of pattern strings in a text.
A substring of a string is another string that occurs in it. For example, ver is a substring of
stackoverflow. This is not to be confused with a subsequence: cover is a subsequence of the
same string, but not a substring. In other words, any subset of consecutive letters in a string is
a substring of the given string.
In the Rabin-Karp algorithm, we'll generate a hash of the pattern that we are looking for and
check whether the rolling hash of our text matches it. If it doesn't match, we can guarantee that
the pattern doesn't start at that position in the text. However, if it does match, the pattern may
be present there.
Let's look at an example:
Let's say we have a text: yeminsajid and we want to find out if the pattern nsa exists in the text.
To calculate the hash and rolling hash, we'll need to use a prime number. This can be any prime
number. Let's take prime = 11 for this example. We'll determine hash value using this formula:
(1st letter) X (prime)⁰ + (2nd letter) X (prime)¹ + (3rd letter) X (prime)² + ...
We'll denote:
a -> 1    g -> 7     m -> 13    s -> 19    y -> 25
b -> 2    h -> 8     n -> 14    t -> 20    z -> 26
c -> 3    i -> 9     o -> 15    u -> 21
d -> 4    j -> 10    p -> 16    v -> 22
e -> 5    k -> 11    q -> 17    w -> 23
f -> 6    l -> 12    r -> 18    x -> 24
Now we find the rolling hash of our text. If the rolling hash matches the hash value of our
pattern, we'll check whether the strings actually match. Since our pattern has 3 letters, we'll
take the first 3 letters yem from our text and calculate the hash value. We get:

25 X 11⁰ + 5 X 11¹ + 13 X 11² = 1653

This value doesn't match our pattern's hash value (nsa hashes to 14 X 11⁰ + 19 X 11¹ + 1 X 11² =
344), so the pattern doesn't start here. Now we need to consider the next window, emi. We could
calculate its hash value from scratch using our formula, but that would cost us more. Instead, we
use another technique:
• We subtract the value of the first letter of the previous string from our current hash value. In
this case, y. We get 1653 - 25 = 1628.
• We divide the difference by our prime, which is 11 for this example. We get 1628 / 11 = 148.
• We add (new letter) X (prime)^(m-1), where m is the length of the pattern, to the quotient. The
new letter here is i, whose value is 9. We get 148 + 9 X 11² = 1237.
The new hash value is not equal to our pattern's hash value. Rolling the window on in the same
way, the window ins has hash value 2462; moving one step further to nsa we get:

2462 - 9 = 2453
2453 / 11 = 223
223 + 1 X 11² = 344
It's a match! Now we compare our pattern with the current string. Since both the strings match, the
substring exists in this string. And we return the starting position of our substring.
The procedure thus has three parts: hash calculation for the pattern and the first window of the
text, hash recalculation (rolling the hash) for every subsequent window, and a character-by-character
string match whenever the two hash values agree.
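A compact C# sketch of these three parts, using the letter values (a = 1 ... z = 26) and the prime 11 from the example above. It relies on exact division just as the walkthrough does, which only works without modular reduction; for long patterns you would instead take hashes modulo a large prime. All names are illustrative:

using System;

class RabinKarp
{
    const long Prime = 11;

    static long CharValue(char c) => c - 'a' + 1;   // a -> 1, b -> 2, ..., z -> 26

    // Hash calculation: v(s[0])*p^0 + v(s[1])*p^1 + ... + v(s[m-1])*p^(m-1)
    static long Hash(string s, int m)
    {
        long h = 0, pow = 1;
        for (int i = 0; i < m; i++)
        {
            h += CharValue(s[i]) * pow;
            pow *= Prime;
        }
        return h;
    }

    static int Search(string text, string pattern)
    {
        int n = text.Length, m = pattern.Length;
        long patternHash = Hash(pattern, m);
        long windowHash = Hash(text, m);
        long highPow = 1;                        // Prime^(m-1)
        for (int i = 1; i < m; i++) highPow *= Prime;

        for (int i = 0; i + m <= n; i++)
        {
            // String match: verify character by character to rule out collisions
            if (windowHash == patternHash && text.Substring(i, m) == pattern)
                return i;
            // Hash recalculation: drop text[i], divide by the prime, add the new letter
            if (i + m < n)
                windowHash = (windowHash - CharValue(text[i])) / Prime
                             + CharValue(text[i + m]) * highPow;
        }
        return -1;
    }

    static void Main()
    {
        Console.WriteLine(Search("yeminsajid", "nsa"));   // 4
    }
}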
This algorithm is used in detecting plagiarism. Given source material, the algorithm can rapidly
search through a paper for instances of sentences from the source material, ignoring details such
as case and punctuation. Because of the abundance of the sought strings, single-string searching
algorithms are impractical here. The Knuth-Morris-Pratt algorithm and the Boyer-Moore string
search algorithm are faster single-pattern string searching algorithms than Rabin-Karp, but
Rabin-Karp is an algorithm of choice for multiple pattern search: if we want to find any of a
large number, say k, of fixed-length patterns in a text, we can create a simple variant of the
Rabin-Karp algorithm. For a text of length n and p patterns of combined length m, its average and
best case running time is O(n+m) in space O(p), but its worst-case time is O(nm).
Introduction To Knuth-Morris-Pratt (KMP) Algorithm

Suppose that we have a text and a pattern. We need to determine whether the pattern exists in the
text or not. For example:
+-------+---+---+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-------+---+---+---+---+---+---+---+---+
| Text | a | b | c | b | c | g | l | x |
+-------+---+---+---+---+---+---+---+---+
+---------+---+---+---+---+
| Index | 0 | 1 | 2 | 3 |
+---------+---+---+---+---+
| Pattern | b | c | g | l |
+---------+---+---+---+---+
This pattern does exist in the text. So our substring search should return 3, the index of the
position from which this pattern starts. So how does our brute force substring search procedure
work?
What we usually do is: we start from the 0th index of the text and the 0th index of our pattern and
we compare Text[0] with Pattern[0]. Since they are not a match, we go to the next index of our
text and we compare Text[1] with Pattern[0]. Since this is a match, we increment the index of our
pattern and the index of the Text also. We compare Text[2] with Pattern[1]. They are also a
match. Following the same procedure stated before, we now compare Text[3] with Pattern[2]. As
they do not match, we start from the next position where we started finding the match. That is
index 2 of the Text. We compare Text[2] with Pattern[0]. They don't match. Then incrementing
index of the Text, we compare Text[3] with Pattern[0]. They match. Again Text[4] and Pattern[1]
match, Text[5] and Pattern[2] match and Text[6] and Pattern[3] match. Since we've reached the
end of our Pattern, we now return the index from which our match started, that is 3. If our pattern
had been bcgll, meaning the pattern didn't exist in our text, our search should return an
exception, or -1, or any other predefined value. We can clearly see that, in the worst case, this
algorithm would take O(mn) time, where m is the length of the Text and n is the length of the
Pattern. How do we
reduce this time complexity? This is where KMP Substring Search Algorithm comes into the
picture.
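For reference, the brute force procedure just described, sketched in C# (returns the starting index of the first match, or -1):

static int BruteForceSearch(string text, string pattern)
{
    // Try every starting position; worst case O(mn) as discussed above
    for (int i = 0; i + pattern.Length <= text.Length; i++)
    {
        int j = 0;
        while (j < pattern.Length && text[i + j] == pattern[j])
            j++;
        if (j == pattern.Length)
            return i;   // the whole pattern matched starting at index i
    }
    return -1;
}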
The Knuth-Morris-Pratt String Searching Algorithm or KMP Algorithm searches for occurrences of
a "Pattern" within a main "Text" by employing the observation that when a mismatch occurs, the
word itself embodies sufficient information to determine where the next match could begin, thus
bypassing re-examination of previously matched characters. The algorithm was conceived in 1970
by Donald Knuth and Vaughan Pratt and independently by James H. Morris. The trio published it
jointly in 1977.
Let's extend our example Text and Pattern for better understanding:
+-------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| Index |0 |1 |2 |3 |4 |5 |6 |7 |8 |9 |10|11|12|13|14|15|16|17|18|19|20|21|22|
+-------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| Text |a |b |c |x |a |b |c |d |a |b |x |a |b |c |d |a |b |c |d |a |b |c |y |
+-------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+---------+---+---+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---------+---+---+---+---+---+---+---+---+
| Pattern | a | b | c | d | a | b | c | y |
+---------+---+---+---+---+---+---+---+---+
At first, our Text and Pattern match till index 2. Text[3] and Pattern[3] don't match. So our
aim is to not go backwards in this Text; that is, in case of a mismatch, we don't want our matching
to begin again from the position that we started matching with. To achieve that, we'll look for a
suffix in our Pattern right before our mismatch occurred (substring abc), which is also a prefix of
the substring of our Pattern. For our example, since all the characters are unique, there is no
suffix that is also a prefix of our matched substring. So what that means is, our next comparison will
start from index 0. Hold on for a bit, you'll understand why we did this. Next, we compare Text[3]
with Pattern[0] and it doesn't match. After that, for Text from index 4 to index 9 and for Pattern
from index 0 to index 5, we find a match. We find a mismatch in Text[10] and Pattern[6]. So we
take the substring from Pattern right before the point where the mismatch occurs (substring abcdab),
and we check for a suffix that is also a prefix of this substring. We can see here ab is both the suffix
and prefix of this substring. What that means is, since we've matched until Text[10], the
characters right before the mismatch is ab. What we can infer from it is that since ab is also a
prefix of the substring we took, we don't have to check ab again and the next check can start from
Text[10] and Pattern[2]. We didn't have to look back to the whole Text, we can start directly from
where our mismatch occurred. Now we check Text[10] and Pattern[2]; it's a mismatch, and the
substring before the mismatch (ab) doesn't contain a suffix which is also a prefix, so we check
Text[10] and Pattern[0]. They don't match. After that, for Text from index 11 to index 17 and for
Pattern from index 0 to index 6. We find a mismatch in Text[18] and Pattern[7]. So again we
check the substring before mismatch (substring abcdabc) and find abc is both the suffix and the
prefix. So since we matched till Pattern[7], abc must be before Text[18]. That means, we don't
need to compare until Text[17] and our comparison will start from Text[18] and Pattern[3]. Thus
we will find a match and we'll return 15 which is our starting index of the match. This is how our
KMP Substring Search works using suffix and prefix information.
Now, how do we efficiently compute whether a suffix is the same as a prefix, and at what point do
we restart the check if there is a character mismatch between Text and Pattern? Let's take a look
at an example:
+---------+---+---+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---------+---+---+---+---+---+---+---+---+
| Pattern | a | b | c | d | a | b | c | a |
+---------+---+---+---+---+---+---+---+---+
We'll generate an array containing the required information. Let's call the array S. The size of the
array will be same as the length of the pattern. Since the first letter of the Pattern can't be the
suffix of any prefix, we'll put S[0] = 0. We take i = 1 and j = 0 at first. At each step we compare
Pattern[i] and Pattern[j] and increment i. If there is a match we put S[i] = j + 1 and increment j, if
there is a mismatch, we check the previous value position of j (if available) and set j = S[j-1] (if j is
not equal to 0). We keep doing this until Pattern[j] matches Pattern[i] or j becomes 0; in the latter
case, if they still don't match, we put S[i] = 0. For our example:
(j at index 0, i at index 1)
+---------+---+---+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---------+---+---+---+---+---+---+---+---+
| Pattern | a | b | c | d | a | b | c | a |
+---------+---+---+---+---+---+---+---+---+
Pattern[j] and Pattern[i] don't match, so we increment i, and since j is 0, we don't check the
previous value and put S[i] = 0. If we keep incrementing i, for i = 4 we'll get a match, so we
put S[i] = S[4] = j + 1 = 0 + 1 = 1 and increment j and i. Our array will look like:
(j at index 1, i at index 5)
+---------+---+---+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---------+---+---+---+---+---+---+---+---+
| Pattern | a | b | c | d | a | b | c | a |
+---------+---+---+---+---+---+---+---+---+
| S | 0 | 0 | 0 | 0 | 1 | | | |
+---------+---+---+---+---+---+---+---+---+
Continuing in the same way for the remaining positions, the completed array is:

+---------+---+---+---+---+---+---+---+---+
| S | 0 | 0 | 0 | 0 | 1 | 2 | 3 | 1 |
+---------+---+---+---+---+---+---+---+---+
This is our required array. Here a nonzero-value of S[i] means there is a S[i] length suffix same as
the prefix in that substring (substring from 0 to i) and the next comparison will start from S[i] + 1
position of the Pattern. Our algorithm to generate the array would look like:
Procedure GenerateSuffixArray(Pattern):
i := 1
j := 0
n := Pattern.length
while i is less than n
if Pattern[i] is equal to Pattern[j]
S[i] := j + 1
j := j + 1
i := i + 1
else
if j is not equal to 0
j := S[j-1]
else
S[i] := 0
i := i + 1
end if
end if
end while
The time complexity to build this array is O(n) and the space complexity is also O(n). To make sure
you have completely understood the algorithm, try to generate the array for the pattern aabaabaa and
check if the result matches this one: 0 1 0 1 2 3 4 5.
+---------+---+---+---+---+---+---+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |
+---------+---+---+---+---+---+---+---+---+---+---+---+---+
| Text | a | b | x | a | b | c | a | b | c | a | b | y |
+---------+---+---+---+---+---+---+---+---+---+---+---+---+
+---------+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 |
+---------+---+---+---+---+---+---+
| Pattern | a | b | c | a | b | y |
+---------+---+---+---+---+---+---+
| S | 0 | 0 | 0 | 1 | 2 | 0 |
+---------+---+---+---+---+---+---+
We have a Text, a Pattern and a pre-calculated array S using our logic defined before. We
compare Text[0] and Pattern[0] and they are same. Text[1] and Pattern[1] are same. Text[2]
and Pattern[2] are not same. We check the value at the position right before the mismatch. Since
S[1] is 0, there is no suffix that is same as the prefix in our substring and our comparison starts at
position S[1], which is 0. So Pattern[0] is not the same as Text[2], and we move on. Text[3] is the
same as Pattern[0], and there is a match up to Text[7] and Pattern[4], with a mismatch at Text[8]
and Pattern[5]. We go one step back in the S array
and find 2. So this means there is a prefix of length 2 which is also the suffix of this substring (
abcab) which is ab. That also means that there is an ab before Text[8]. So we can safely ignore
Pattern[0] and Pattern[1] and start our next comparison from Pattern[2] and Text[8]. If we
continue, we'll find the Pattern in the Text. Our procedure will look like:
Procedure KMP(Text, Pattern)
S := GenerateSuffixArray(Pattern)
m := Text.length
n := Pattern.length
i := 0
j := 0
while i is less than m
    if Text[i] is equal to Pattern[j]
        i := i + 1
        j := j + 1
    end if
    if j is equal to n
        Return (i - j)
    else if i < m and Text[i] is not equal to Pattern[j]
        if j is not equal to 0
            j = S[j-1]
        else
            i := i + 1
        end if
    end if
end while
Return -1
The time complexity of this algorithm apart from the Suffix Array Calculation is O(m). Since
GenerateSuffixArray takes O(n), the total time complexity of KMP Algorithm is: O(m+n).
PS: If you want to find multiple occurrences of Pattern in the Text, instead of returning the value,
print it/store it and set j := S[j-1]. Also keep a flag to track whether you have found any
occurrence or not and handle it accordingly.
Python Implementation of KMP algorithm

Time complexity: the search portion (strstr method) has complexity O(n), where n is the length of
haystack; the needle is also pre-parsed to build the prefix table, which requires O(m), where m is
the length of the needle.
Therefore, the overall time complexity for KMP is O(n+m).
Space complexity: O(m) because of prefix table on needle.
Note: the following implementation returns the start position of the match in haystack (if there is
a match), and otherwise returns -1, handling edge cases such as needle/haystack being an empty
string or needle not being found in haystack.
def get_prefix_table(needle):
prefix_set = set()
n = len(needle)
prefix_table = [0]*n
delimeter = 1
while(delimeter<n):
prefix_set.add(needle[:delimeter])
j = 1
while(j<delimeter+1):
if needle[j:delimeter+1] in prefix_set:
prefix_table[delimeter] = delimeter - j + 1
break
j += 1
delimeter += 1
return prefix_table
def strstr(haystack, needle):
    # m: position in haystack, i: position in needle
    haystack_len = len(haystack)
    needle_len = len(needle)
    if needle_len == 0:
        return 0
    if needle_len > haystack_len:
        return -1
    prefix_table = get_prefix_table(needle)
    m = i = 0
while((i<needle_len) and (m<haystack_len)):
if haystack[m] == needle[i]:
i += 1
m += 1
else:
if i != 0:
i = prefix_table[i-1]
else:
m += 1
if i==needle_len and haystack[m-1] == needle[i-1]:
return m - needle_len
else:
return -1
if __name__ == '__main__':
    needle = 'abcaby'
    haystack = 'abxabcabcaby'
    print(strstr(haystack, needle))
Chapter 64: Travelling Salesman
Remarks
The Travelling Salesman Problem is the problem of finding the minimum cost of travelling through
N vertices exactly once per vertex. There is a cost cost[i][j] to travel from vertex i to vertex j.
There are two types of algorithms to solve this problem: exact algorithms and approximation
algorithms.

Exact Algorithms

Exact algorithms always find the optimal answer; the brute force and dynamic programming
approaches in the Examples below are both exact.

Approximation Algorithms

To be added
Examples
Brute Force Algorithm
A path through every vertex exactly once is the same as an ordering of the vertices. Thus, to
calculate the minimum cost of travelling through every vertex exactly once, we can brute force
every single one of the N! permutations of the numbers from 1 to N.
Pseudocode

minimum = INF
for all permutations P
    current = 0
    for i from 0 to N-2
        current = current + cost[P[i]][P[i+1]]   <- Add the cost of going from one vertex to the next
    current = current + cost[P[N-1]][P[0]]       <- Add the cost of going from the last vertex to the first
    if current < minimum                         <- Update the minimum if needed
        minimum = current
output minimum
Time Complexity
There are N! permutations to go through and the cost of each path is calculated in O(N), thus this
algorithm takes O(N * N!) time to output the exact answer.
Dynamic Programming Algorithm
Notice that the brute force approach evaluates many permutations that share a common prefix, for
example:

(1,2,3,4,6,0,5,7)
(1,2,3,5,0,6,7,4)

The cost of going from vertex 1 to vertex 2 to vertex 3 remains the same in both, so why must it be
recalculated? This result can be saved for later use.
Let dp[bitmask][vertex] represent the minimum cost of travelling through all the vertices whose
corresponding bit in bitmask is set to 1 ending at vertex. For example:
dp[12][2]
12 = 1 1 0 0
^ ^
vertices: 3 2 1 0
Since 12 represents 1100 in binary, dp[12][2] represents going through vertices 2 and 3 in the graph
with the path ending at vertex 2.
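A memoized C# sketch of this recursion; the function shape matches the TSP(bitmask, pos) discussed below, while the cost matrix and driver are illustrative:

using System;

class Tsp
{
    const int INF = int.MaxValue / 2;
    static int n;          // number of vertices
    static int[,] cost;    // cost[i, j]: cost of travelling from vertex i to vertex j
    static int[,] dp;      // dp[bitmask, pos]; -1 means "not computed yet"

    // Minimum cost of visiting every vertex not yet in 'bitmask',
    // starting from 'pos' and finally returning to vertex 0.
    static int TSP(int bitmask, int pos)
    {
        if (bitmask == (1 << n) - 1)
            return cost[pos, 0];               // all visited: close the tour
        if (dp[bitmask, pos] != -1)
            return dp[bitmask, pos];

        int best = INF;
        for (int i = 0; i < n; i++)
        {
            if ((bitmask & (1 << i)) == 0)     // vertex i not visited yet
                best = Math.Min(best, cost[pos, i] + TSP(bitmask | (1 << i), i));
        }
        return dp[bitmask, pos] = best;
    }

    static void Main()
    {
        cost = new int[,] { { 0, 20, 42, 35 },
                            { 20, 0, 30, 34 },
                            { 42, 30, 0, 12 },
                            { 35, 34, 12, 0 } };
        n = 4;
        dp = new int[1 << n, n];
        for (int mask = 0; mask < (1 << n); mask++)
            for (int v = 0; v < n; v++)
                dp[mask, v] = -1;
        Console.WriteLine(TSP(1, 0));   // start at vertex 0; prints 97 for this matrix
    }
}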
Here, bitmask | (1 << i) sets the ith bit of bitmask to 1, which represents that the ith vertex has
been visited. The i after the comma represents the new pos in that function call, i.e. the new
"last" vertex. cost[pos][i] adds the cost of travelling from vertex pos to vertex i. Thus, this
line updates the answer to the minimum possible cost of travelling onward to every vertex that has
not been visited yet.
Time Complexity
The function TSP(bitmask, pos) has 2^N values for bitmask and N values for pos. Each function call
takes O(N) time to run (the for loop). Thus this implementation takes O(N^2 * 2^N) time to output
the exact answer.
Chapter 65: Trees
Remarks
Trees are a sub-category or sub-type of node-edge graphs. They are ubiquitous within computer
science because of their prevalence as a model for many different algorithmic structures that are,
in turn, applied in many different algorithms.
Examples
Introduction
Trees are a sub-type of the more general node-edge graph data structure.
The tree data structure is quite common within computer science. Trees are used to model many
different algorithmic data structures, such as ordinary binary trees, red-black trees, B-trees,
AB-trees, 2-3 trees, heaps, and tries.
Typically we represent an n-ary tree (one with potentially unlimited children per node) as a binary
tree (one with exactly two children per node). The "next" child is regarded as a sibling. Note that if
a tree is binary, this representation creates extra nodes.
We then iterate over the siblings and recurse down the children. As most trees are relatively
shallow - lots of children but only a few levels of hierarchy, this gives rise to efficient code. Note
human genealogies are an exception (lots of levels of ancestors, only a few children per level).
If necessary, back pointers can be kept to allow the tree to be ascended. These are more difficult
to maintain.
Note that it is typical to have one function to call on the root and a recursive function with extra
parameters, in this case tree depth.
struct node
{
    struct node *next;
    struct node *child;
    std::string data;
};

void printtree_r(struct node *node, int depth)
{
    int i;

    while(node)
    {
        for(i = 0; i < depth * 3; i++)
            printf(" ");
        printf("%s\n", node->data.c_str());
        if(node->child)
        {
            for(i = 0; i < depth * 3; i++)
                printf(" ");
            printf("{\n");
            printtree_r(node->child, depth + 1);
            for(i = 0; i < depth * 3; i++)
                printf(" ");
            printf("}\n");
        }
        node = node->next;
    }
}

void printtree(struct node *root)
{
    printtree_r(root, 0);
}
(Examples 1 and 2: figures of tree pairs, (a) and (b), to be compared for equality.)

Two binary trees are the same if corresponding nodes carry the same data and the same left/right
structure:
bool sameTree(struct node *root1, struct node *root2)
{
    if(root1 == NULL && root2 == NULL) return true;
    if(root1 == NULL || root2 == NULL) return false;
    return root1->data == root2->data
        && sameTree(root1->left, root2->left)
        && sameTree(root1->right, root2->right);
}
Credits

S.No  Chapters                                                     Contributors

3     A* Pathfinding Algorithm                                     TajyMany
6     Applications of Dynamic Programming                          Chris, user2314737
7     Applications of Greedy technique                             EsmaeelE, goeddek, Tejus Prasad, user2314737
8     Bellman–Ford Algorithm                                       Bakhtiar Hasan, Sumeet Singh, user2314737, Yerken
11    Binary Tree traversals                                       Isha Agarwal
12    Breadth-First Search                                         Bakhtiar Hasan, mnoronha, Sumeet Singh, Zubayet Zaman Zico
15    Catalan Number Algorithm                                     Keyur Ramoliya, mnoronha
16    Check if a tree is BST or not                                Isha Agarwal, Janaky Murthy
23    Dynamic Time Warping                                         Bakhtiar Hasan, mnoronha, Zubayet Zaman Zico
24    Edit Distance Dynamic Algorithm                              Vishwas
26    Fast Fourier Transform                                       Dr. ABT, EsmaeelE
27    Floyd-Warshall Algorithm                                     Bakhtiar Hasan, Sayakiss
39    Longest Common Subsequence                                   Bakhtiar Hasan, Keyur Ramoliya
40    Longest Increasing Subsequence                               Keyur Ramoliya, mnoronha
41    Lowest common ancestor of a Binary Tree                      Isha Agarwal
42    Matrix Exponentiation                                        Bakhtiar Hasan, mnoronha
44    Maximum Subarray Algorithm                                   Keyur Ramoliya, mnoronha
46    Multithreaded Algorithms                                     Julien Rousé
52    polynomial-time bounded algorithm for Minimum Vertex Cover   Alber Tadrous
53    Prim's Algorithm                                             Bakhtiar Hasan, Tejus Prasad
54    Pseudocode                                                   Community
60    Shortest Common Supersequence Problem                        Keyur Ramoliya
61    Sliding Window Algorithm                                     Keyur Ramoliya