cd3291 Dsa Study Material
STUDY MATERIAL
Downloaded by s h s g (pgpcet2022@gmail.com)
lOMoARcPSD|42733091
TOTAL: 45 HOURS
COURSE OUTCOMES:
At the end of the course, the student should be able to:
Explain abstract data types.
Design, implement, and analyse linear data structures, such as lists, queues, and
stacks, according to the needs of different applications.
Design, implement, and analyse efficient tree structures to meet requirements such
as searching, indexing, and sorting.
Model problems as graph problems and implement efficient graph algorithms to solve
them.
TEXT BOOKS:
1. Michael T. Goodrich, Roberto Tamassia, and Michael H. Goldwasser, “Data Structures
and Algorithms in Python” (An Indian Adaptation), Wiley, 2021.
2. Kent D. Lee and Steve Hubbard, “Data Structures and Algorithms with Python”, Springer, 2015.
3. Narasimha Karumanchi, “Data Structures and Algorithmic Thinking with Python”, CareerMonk, 2015.
Variables
Variables are names that hold values. Ex: in the equation x^2 + 2y - 2 = 1, x and y are variables that can take different values.
Data Types
The number of bits allocated for each primitive data type depends on the programming
languages, the compiler and the operating system.
For the same primitive data type, different languages may use different sizes.
Depending on the size of the data types, the total available values (domain) will also change.
For example, “int” may take 2 bytes or 4 bytes. If it takes 2 bytes (16 bits), then the
total possible values range from -32,768 to +32,767 (-2^15 to 2^15 - 1). If it takes 4 bytes (32
bits), then the possible values are between -2,147,483,648 and +2,147,483,647 (-2^31 to
2^31 - 1). The same is the case with other data types.
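These ranges follow from two's complement representation, and can be computed directly (an illustrative sketch only; Python's own int type is arbitrary-precision, so this merely calculates the ranges):

```python
# Value range of a signed integer stored in a given number of bits,
# assuming two's complement representation: [-2^(bits-1), 2^(bits-1) - 1].
def signed_range(bits):
    return -2 ** (bits - 1), 2 ** (bits - 1) - 1

print(signed_range(16))  # (-32768, 32767)
print(signed_range(32))  # (-2147483648, 2147483647)
```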
If the system-defined data types are not enough, then most programming languages
allow users to define their own data types, called user-defined data types.
Good examples of user-defined data types are structures in C/C++ and classes in
Java.
For example, in the snippet below, we combine several system-defined data types
and call the resulting user-defined data type “newType”.
This gives more flexibility and comfort in dealing with computer memory.
struct newType
{
    int data1;
    float data2;
    ...
    char datan;
};
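Since this material uses Python, a rough Python analogue of the C struct above is a simple class (a sketch; the class name mirrors the struct, and the sample field values are arbitrary):

```python
# Python analogue of the C struct "newType": a user-defined type that
# groups values of several built-in types under one name.
class NewType:
    def __init__(self, data1, data2, datan):
        self.data1 = data1   # int
        self.data2 = data2   # float
        self.datan = datan   # single-character string

record = NewType(10, 3.5, 'x')
print(record.data1, record.data2, record.datan)  # 10 3.5 x
```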
Data Structures
A data structure is a particular way of storing and organizing data in a computer so that
it can be used efficiently.
A data structure is a special format for organizing and storing data. General data
structure types include arrays, files, linked lists, stacks, queues, trees, graphs and so on.
Depending on the organization of the elements, data structures are classified into two types:
1) Linear data structures: Elements are accessed in a sequential order, but it is not
compulsory to store all elements sequentially. Examples: linked lists, stacks and queues.
2) Non-linear data structures: Elements of this data structure are stored/accessed
in a non-linear order. Examples: trees and graphs.
To simplify the process of solving problems, we combine the data structures with their
operations and we call this Abstract Data Types (ADTs). An ADT consists of two parts:
1. Declaration of data
2. Declaration of operations
Commonly used ADTs include: Linked Lists, Stacks, Queues, Priority Queues, Binary
Trees, Dictionaries, Disjoint Sets (Union and Find), Hash Tables, Graphs, and many others.
Robustness
A program produces the right output for all the anticipated inputs in the program’s
application. In addition, we want software to be robust, that is, capable of handling
unexpected inputs that are not explicitly defined for its application.
Adaptability
Software, therefore, needs to be able to evolve over time in response to changing
conditions in its environment. Thus, another important goal of quality software is that it
achieves adaptability (also called evolvability).
Related to this concept is portability, which is the ability of software to run with minimal
change on different hardware and operating system platforms. An advantage of writing
software in Python is the portability provided by the language itself.
Reusability
Developing quality software can be an expensive enterprise, and its cost can be offset
somewhat if the software is designed in a way that makes it easily reusable in future
applications.
Such reuse should be done with care, however, for one of the major sources of
software errors in the Therac-25 came from inappropriate reuse of Therac-20 software.
Python’s standard libraries include, for example, the math module, which provides
definitions for key mathematical constants and functions, and the os module, which provides
support for interacting with the operating system.
Abstraction
Applying the abstraction paradigm to the design of data structures gives rise to abstract
data types (ADTs). An ADT is a mathematical model of a data structure that specifies the type
of data stored, the operations supported on the data, and the types of parameters of those
operations. The collective set of behaviors supported by an ADT is its public interface.
Python supports abstract data types using a mechanism known as an abstract base
class (ABC). An abstract base class cannot be instantiated (i.e., you cannot directly create an
instance of that class), but it defines one or more common methods that all implementations
of the abstraction must have.
An ABC is realized by one or more concrete classes that inherit from the abstract base
class while providing implementations for those methods declared by the ABC.
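A minimal sketch of an ABC and a concrete subclass (the Shape/Square names are illustrative, not from the source):

```python
# An abstract base class with one abstract method, and a concrete class
# that realizes it by providing an implementation.
from abc import ABC, abstractmethod

class Shape(ABC):                 # abstract base class: cannot be instantiated
    @abstractmethod
    def area(self):
        """Return the area of the shape."""

class Square(Shape):              # concrete class realizing the ABC
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side * self.side

print(Square(3).area())  # 9
# Shape() would raise TypeError: can't instantiate an abstract class
```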
Encapsulation
One of the main advantages of encapsulation is that it gives one programmer freedom
to implement the details of a component, without concern that other programmers will be
writing code that intricately depends on those internal decisions.
Class Definitions
A class provides a set of behaviors in the form of member functions (also known as
methods), with implementations that are common to all instances of that class.
A class also serves as a blueprint for its instances, effectively determining the way
that state information for each instance is represented in the form of attributes (also known as
fields, instance variables, or data members)
Defining a Class:
Just as function definitions begin with the def keyword in Python, class definitions begin
with the class keyword.
The first string inside the class is called a docstring and gives a brief description of the
class. Although not mandatory, this is highly recommended.
class MyNewClass:
    '''This is a docstring. I have created a new class'''
    pass
A class creates a new local namespace where all its attributes are defined. Attributes
may be data or functions.
There are also special attributes that begin with double underscores. For
example, __doc__ gives us the docstring of that class.
As soon as we define a class, a new class object is created with the same name. This
class object allows us to access the different attributes as well as to instantiate new objects of
that class.
Example:
class Person:
    "This is a person class"
    age = 10

    def greet(self):
        print('Hello')

print(Person.age)
print(Person.greet)
print(Person.__doc__)
Output:
10
<function Person.greet at 0x7fc78c6e8160>
This is a person class
We saw that the class object can be used to access different attributes. It can also
be used to create new object instances (instantiation) of that class. The procedure to create
an object is similar to a function call, for example harry = Person().
This creates a new object instance named harry. We can access the attributes of
objects using the object name as a prefix.
Example:
class CreditCard:
    def __init__(self, customer, bank, acnt, limit):
        self._customer = customer
        self._bank = bank
        self._account = acnt
        self._limit = limit
        self._balance = 0

    def get_customer(self):
        return self._customer

    def get_bank(self):
        return self._bank
The Constructor
The __init__ method serves as the constructor of the class. Its primary responsibility
is to establish the state of a newly created credit card object with appropriate instance
variables.
Encapsulation
A single leading underscore in the name of a data member, such as _balance, implies
that it is intended as non-public. Users of a class should not directly access such members.
Additional Methods
The most interesting behaviors in our class are charge and make_payment. The
charge function typically adds the given price to the credit card balance, to reflect a purchase
of that price by the customer.
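These behaviors can be sketched as follows (a self-contained version of the class, in which charge refuses a purchase that would exceed the credit limit; the sample account values are illustrative):

```python
# The charge and make_payment behaviors for the CreditCard class,
# following the same single-underscore naming style as above.
class CreditCard:
    def __init__(self, customer, bank, acnt, limit):
        self._customer = customer
        self._bank = bank
        self._account = acnt
        self._limit = limit
        self._balance = 0

    def charge(self, price):
        """Charge the given price; return True if processed, False if refused."""
        if price + self._balance > self._limit:   # would exceed the limit
            return False
        self._balance += price
        return True

    def make_payment(self, amount):
        """Reduce the outstanding balance by the given amount."""
        self._balance -= amount

card = CreditCard('John', 'Savings Bank', '1234', 1000)
card.charge(300)
card.make_payment(100)
print(card._balance)  # 200
```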
Inheritance:
There are two ways in which a subclass can differentiate itself from its superclass. A
subclass may specialize an existing behaviour by providing a new implementation that
overrides an existing method. A subclass may also extend its superclass by providing brand
new methods.
Syntax:
class BaseClass:
Body of base class
class DerivedClass(BaseClass):
Body of derived class
Derived class inherits features from the base class where new features can be added to
it. This results in re-usability of code.
Types of Inheritance
Depending upon the number of child and parent classes involved, there are four types
of inheritance in python.
Single Inheritance
When a child class inherits only a single parent class.
Example:
class Parent:
    def func1(self):
        print("this is function one")

class Child(Parent):
    def func2(self):
        print("this is function 2")

ob = Child()
ob.func1()
ob.func2()
Multiple Inheritance
When a child class inherits from more than one parent class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Parent2:
    def func2(self):
        print("this is function 2")

class Child(Parent, Parent2):
    def func3(self):
        print("this is function 3")

ob = Child()
ob.func1()
ob.func2()
ob.func3()
Multilevel Inheritance
When a child class becomes a parent class for another child class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child2(Child):
    def func3(self):
        print("this is function 3")

ob = Child2()
ob.func1()
ob.func2()
ob.func3()
Hierarchical Inheritance
When more than one child class inherits from the same base (parent)
class.
Example:
class Parent:
    def func1(self):
        print("this is function one")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child1(Parent):
    def func3(self):
        print("this is function 3")

class Child3(Parent):
    def func4(self):
        print("this is function 4")

ob = Child3()
ob.func1()
ob.func4()
When a function is invoked, a new scope is created for that function call. Therefore, an
assignment x = 5 within a function has no effect on the identifier x in the broader scope.
Each distinct scope in Python is represented using an abstraction known as a namespace.
A namespace manages all identifiers that are currently defined in a given scope.
The process of determining the value associated with an identifier is known as name
resolution.
In a Python program, there are three types of namespaces:
1. Built-In
2. Global
3. Local
Built-In Namespace
This namespace gets created when the interpreter starts. It stores all the keywords and
built-in names. It is the superset of all namespaces; this is the reason we can use
print, True, etc. from any part of the code.
Global Namespace
This is the namespace that holds all the global objects. This namespace gets created
when the program starts running and exists till the end of the execution.
Local Namespace
This is the namespace that generally exists for some part of the time during the
execution of the program. It stores the names of objects defined inside a function. These
namespaces exist as long as the functions exist. This is the reason we cannot globally access
a variable created inside a function.
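A short sketch of global vs. local namespaces (the names site and show are illustrative):

```python
# The name bound inside the function lives in the local namespace and
# disappears when the call returns; the global binding is untouched.
site = "PythonGeeks"          # global namespace

def show():
    site = "Python"           # local namespace; shadows the global name
    return site

print(show())   # Python
print(site)     # PythonGeeks
```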
Let's take an example where we create a list named old_list and pass an object
reference to new_list using the = operator.
Example:
# Copy using = operator
old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 'a']]
new_list = old_list

new_list[2][2] = 9

print('Old List:', old_list)
print('ID of Old List:', id(old_list))
print('New List:', new_list)
print('ID of New List:', id(new_list))
The output will be:
Old List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of Old List: 140673303268168
New List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of New List: 140673303268168
The output shows that both variables old_list and new_list share the same id, i.e.
140673303268168. So changes made through either new_list or old_list are visible in both.
Essentially, sometimes you may want to keep the original values unchanged and only modify
the new values, or vice versa. In Python, there are two ways to create copies:
Shallow Copy
Deep Copy
To make these copies, the copy module is used.
For example:
import copy
copy.copy(x)
copy.deepcopy(x)
Here, copy() returns a shallow copy of x. Similarly, deepcopy() returns a deep copy of x.
Shallow Copy
A shallow copy creates a new object which stores the reference of the original
elements. So, a shallow copy doesn't create a copy of nested objects, instead it just copies
the reference of nested objects. This means, a copy process does not recurse or create copies
of nested objects itself.
Example: Create a copy using shallow copy
import copy

old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
new_list = copy.copy(old_list)

old_list.append([4, 4, 4])

print("Old list:", old_list)
print("New list:", new_list)
The output will be:
Old list: [[1, 2, 3], [4, 5, 6], [7, 8, 9], [4, 4, 4]]
New list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
In the above program, a shallow copy of old_list is created. The new_list contains
references to the original nested objects stored in old_list. Then we append a new sublist,
[4, 4, 4], to old_list. This new sublist is not copied into new_list, because only the outer
list object was copied.
Similarly, a change to an existing nested object, such as old_list[1][1] = 'AA', affects
both old_list and new_list at index [1][1]. This is because both lists share references
to the same nested objects.
Deep Copy
A deep copy creates a new object and recursively adds copies of the nested objects
present in the original.
With a deep copy, changes to nested objects in the original old_list do not affect the
copy new_list, since the nested objects themselves are duplicated.
Example: Adding a new nested object in the list using Deep copy
import copy
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)
old_list[1][0] = 'BB'
print("Old list:", old_list)
print("New list:", new_list)
Output:
Old list: [[1, 1, 1], ['BB', 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
In the above program, the change to old_list affects only old_list. This
means both old_list and new_list are independent. This is because old_list was
recursively copied, which is true for all of its nested objects.
A data structure is a systematic way of organizing and accessing data, and an algorithm
is a step-by-step procedure for performing some task in a finite amount of time. Algorithm
analysis helps us determine which algorithm is most efficient in terms of the time and space
consumed.
Experimental running times of two algorithms are difficult to directly compare unless
the experiments are performed in the same hardware and software environments.
Experiments can be done only on a limited set of test inputs; hence, they leave out
the running times of inputs not included in the experiment (and these inputs may be
important).
An algorithm must be fully implemented in order to execute it to study its running time
experimentally.
Asymptotic analysis, in contrast, allows us to evaluate the relative efficiency of any two
algorithms in a way that is independent of the hardware and software environment.
Types of Analysis
Algorithm analysis considers the inputs with which the algorithm takes less time
(performs well) and the inputs with which it takes a long time.
Worst case
Defines the input for which the algorithm takes the longest time (slowest time to
complete).
The input is the one for which the algorithm runs the slowest.
Best case
Defines the input for which the algorithm takes the least time (fastest time to
complete).
Input is the one for which the algorithm runs the fastest.
Average case
Provides a prediction about the running time of the algorithm.
Run the algorithm many times with many different inputs, and divide the total
running time by the number of trials.
Assumes that the input is random.
Asymptotic Notation
For the best, average and worst cases, we need to identify the upper and lower
bounds. To represent these upper and lower bounds, we need some kind of syntax,
represented in the form of function f(n).
Big-O Notation
This notation gives a tight upper bound on the given function. Thus, it gives the
worst-case complexity of an algorithm, written f(n) = O(g(n)).
For example, if f(n) = n^4 + 100n^2 + 10n + 50 describes the given algorithm, then
g(n) = n^4. That means g(n) gives the maximum rate of growth of f(n) at larger values of n.
Big-Omega Notation
This notation gives a tight lower bound on the given algorithm, and we represent it
as f(n) = Ω(g(n)). Thus, it provides the best-case complexity of an algorithm.
That means, at larger values of n, the tight lower bound of f(n) is g(n). For example,
if f(n) = 100n^2 + 10n + 50, then g(n) = n^2 and f(n) is Ω(n^2).
Big-Theta
There is a notation that allows us to say that two functions grow at the same rate, up
to constant factors (i.e., it encloses the function from above and below). Since it represents
both an upper and a lower bound on the running time of an algorithm, it is used for analyzing
the average-case complexity of an algorithm.
f(n) is Θ(g(n)), pronounced “f(n) is big-Theta of g(n),” if f(n) is O(g(n)) and f(n) is
Ω(g(n)); that is, there are real constants c' > 0 and c'' > 0, and an integer constant n0 ≥ 1,
such that c'·g(n) ≤ f(n) ≤ c''·g(n) for n ≥ n0.
There are some general rules to help us determine the running time of an algorithm.
1) Loops: The running time of a loop is, at most, the running time of the statements inside the
loop (including tests) multiplied by the number of iterations.
2) Nested loops: Analyze from the inside out. Total running time is the product of the sizes of
all the loops.
Example:
// outer loop: executed n times
for (i = 1; i <= n; i++)
    // inner loop: executed n times for each outer iteration
    for (j = 1; j <= n; j++)
        k = k + 1;              // constant time, c
Total time = c × n × n = c·n^2 = O(n^2).
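The same nested loop can be sketched in Python; counting the constant-time statement confirms it executes n × n times (a small illustrative check, not from the source):

```python
# Count how many times the innermost constant-time statement runs in a
# doubly nested loop over range(n): it is n * n, hence O(n^2).
def nested_loop_count(n):
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1        # constant-time work, executed n*n times
    return count

print(nested_loop_count(10))  # 100
```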
3) Consecutive statements: Add the time complexities of each statement.
Example:
x = x + 1;                      // constant time, c0
for (i = 1; i <= n; i++)
    m = m + 2;                  // constant time c1, executed n times
// nested outer loop
for (i = 1; i <= n; i++)
    for (j = 1; j <= n; j++)
        k = k + 1;              // constant time c2, executed n*n times
Total time = c0 + c1·n + c2·n^2 = O(n^2).
4) If-then-else statements:
Worst-case running time: the time of the test, plus the time of either the then part or the
else part (whichever is larger).
if (length() == 0)                              // test: constant time, c0
    return false;                               // then part: constant time, c1
else
    for (int n = 0; n < length(); n++)          // loop: n iterations, c2 each
        if (!list[n].equals(otherList.list[n])) // test: constant time, c3
            return false;
Total time = c0 + c1 + (c2 + c3) × n = O(n).
Examples
The following program is an example of divide-and-conquer programming approach
where the binary search is implemented using python.
Binary Search implementation
In binary search we take a sorted list of elements and start looking for an element at
the middle of the list. If the search value matches the middle value in the list, we
complete the search. Otherwise, we eliminate half of the list of elements by choosing
whether to proceed with the right or left half of the list, depending on the value of the item
searched for.
This is possible because the list is sorted, and it is much quicker than linear search.
Here we divide the given list and conquer by choosing the proper half of the list. We repeat
this approach till we find the element or conclude that it is absent from the list.
Example
def bsearch(list, val):
    list_size = len(list) - 1
    idx0 = 0
    idxn = list_size
    # Find the middle-most value
    while idx0 <= idxn:
        midval = (idx0 + idxn) // 2
        if list[midval] == val:
            return midval
        # Compare the value with the middle-most value
        if val > list[midval]:
            idx0 = midval + 1
        else:
            idxn = midval - 1
    return None

# Initialize the sorted list
list = [2, 7, 19, 34, 53, 72]
print(bsearch(list, 72))    # value at index 5
print(bsearch(list, 11))    # value not in the list
Output:
5
None
Time Complexity
The complexity of the divide and conquer algorithm is calculated using the master
theorem.
T(n) = aT(n/b) + f(n),
where,
n = size of input
a = number of subproblems in the recursion
n/b = size of each subproblem. All subproblems are assumed to have the same size.
f(n) = cost of the work done outside the recursive call, which includes the cost of
dividing the problem and cost of merging the solutions
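For instance, merge sort fits this form with a = 2, b = 2 and f(n) = n, giving T(n) = 2T(n/2) + n; for n a power of two the exact solution is n·log2(n) + n, i.e. O(n log n). A small numerical check of this closed form (an illustrative sketch):

```python
import math

# Merge-sort recurrence: T(1) = 1, T(n) = 2*T(n/2) + n (n a power of two).
def T(n):
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

# The master theorem gives O(n log n); for powers of two the exact
# closed form is n*log2(n) + n.
for n in (2, 4, 8, 16, 256):
    assert T(n) == n * int(math.log2(n)) + n

print(T(16))  # 80
```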
The complexity for the multiplication of two matrices using the naive method is O(n^3),
whereas using the divide and conquer approach (i.e. Strassen's matrix multiplication) it is
O(n^2.8074). This approach also simplifies other problems, such as the Tower of Hanoi.
Merge Sort
Quick Sort
Binary Search
Strassen's Matrix Multiplication
Closest pair (points)
Recursion
Recursion is a technique by which a function makes one or more calls to itself during
its execution, until a base condition is satisfied. Recursion provides a powerful alternative
for performing repetitive tasks.
Illustrative examples:
1) The factorial function (commonly denoted n!) is a classic mathematical function that
has a natural recursive definition.
2) An English ruler has a recursive pattern that is a simple example of a fractal structure.
3) Binary search is among the most important computer algorithms. It allows us to
efficiently locate a desired value in a data set with upwards of billions of entries.
4) The file system for a computer has a recursive structure in which directories can be
nested arbitrarily deeply within other directories. Recursive algorithms are widely used
to explore and manage these file systems.
1) The Factorial Function
The factorial of a positive integer n, denoted n!, is defined as the product of the integers
from 1 to n. If n = 0, then n! is defined as 1 by convention. More formally, for any integer
n ≥ 0:
    n! = 1                              if n = 0
    n! = n · (n - 1) · (n - 2) ··· 2 · 1    if n ≥ 1
The factorial function is used to find the number of ways in which n distinct items can
be arranged into a sequence, that is, the number of permutations of n items. For example, the
three characters a, b, and c can be arranged in 3! = 3 · 2 · 1 = 6 ways: abc, acb, bac, bca,
cab, and cba.
Recursion is not just a mathematical notation; we can use recursion to design a Python
implementation of a factorial function, as shown in Code Fragment 4.1.
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
2) English Ruler
Our goal is to draw the markings of a typical English ruler. For each inch, we place a tick
with a numeric label. We denote the length of the tick designating a whole inch as the major
tick length. Between the marks for whole inches, the ruler contains a series of minor ticks,
placed at intervals of 1/2 inch, 1/4 inch, and so on.
Python Code:
def draw_line(tick_length, tick_label=''):
    """Draw one line with given tick length (followed by optional label)."""
    line = '-' * tick_length
    if tick_label:
        line += ' ' + tick_label
    print(line)

def draw_interval(center_length):
    """Draw tick interval based upon a central tick length."""
    if center_length > 0:                   # stop when length drops to 0
        draw_interval(center_length - 1)    # recursively draw top ticks
        draw_line(center_length)            # draw center tick
        draw_interval(center_length - 1)    # recursively draw bottom ticks

def draw_ruler(num_inches, major_length):
    """Draw English ruler with given number of inches, major tick length."""
    draw_line(major_length, '0')            # draw inch 0 line
    for j in range(1, 1 + num_inches):
        draw_interval(major_length - 1)     # draw interior ticks for inch
        draw_line(major_length, str(j))     # draw inch j line and label

draw_ruler(2, 4)
3) Binary Search
Binary search is used to efficiently locate a target value within a sorted sequence
of n elements.
The algorithm maintains two parameters, low and high, such that all the candidate
entries have index at least low and at most high. Initially, low = 0 and high = n - 1. We then
compare the target value to the median candidate, that is, the item data[mid] with index
mid = (low + high) // 2:
If the target equals data[mid], then we have found the item we are looking for,
and the search terminates successfully.
If target < data[mid], then we recur on the first half of the sequence, that is, on
the interval of indices from low to mid - 1.
If target > data[mid], then we recur on the second half of the sequence, that is,
on the interval of indices from mid + 1 to high.
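The steps above can be sketched as a recursive function (a minimal version; the parameter names follow the description, and the sample data is illustrative):

```python
# Recursive binary search: return True if target is found in sorted data
# within the index interval [low, high], else False.
def binary_search(data, target, low, high):
    if low > high:
        return False                      # interval is empty: no match
    mid = (low + high) // 2
    if target == data[mid]:
        return True                       # found a match
    elif target < data[mid]:
        # recur on the portion of the list left of the middle
        return binary_search(data, target, low, mid - 1)
    else:
        # recur on the portion of the list right of the middle
        return binary_search(data, target, mid + 1, high)

data = [2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37]
print(binary_search(data, 22, 0, len(data) - 1))   # True
print(binary_search(data, 21, 0, len(data) - 1))   # False
```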
An unsuccessful search occurs if low > high, as the interval [low, high] is empty.
This algorithm is known as binary search. Whereas sequential search runs in O(n)
time, the more efficient binary search runs in O(log n) time.
4) File Systems
Modern operating systems define file-system directories (which are also sometimes
called “folders”) in a recursive way. Namely, a file system consists of a top-level directory, and
the contents of this directory consists of files and other directories, which in turn can contain
files and other directories, and so on.
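This recursive structure can be explored with a function like the textbook's disk_usage, sketched here using Python's os module (the per-entry reporting is omitted for brevity):

```python
import os

# Recursively compute the cumulative disk space used by path and any
# entries nested within it (directories are traversed recursively).
def disk_usage(path):
    total = os.path.getsize(path)            # account for direct usage
    if os.path.isdir(path):                  # if this is a directory,
        for filename in os.listdir(path):    # then for each child:
            childpath = os.path.join(path, filename)
            total += disk_usage(childpath)   # add child's usage to total
    return total
```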
With a recursive algorithm, we will account for each operation that is performed based
upon the particular activation of the function that manages the flow of control at the time it is
executed. Stated another way, for each invocation of the function, we only account for the
number of operations that are performed within the body of that activation. We can then
account for the overall number of operations that are executed as part of the recursive
algorithm by taking the sum, over all activations, of the number of operations that take place
during each individual activation.
Computing Factorials
It is relatively easy to analyze the efficiency of our function for computing factorials. A
recursion trace for factorial(n) contains n + 1 activations, each of which performs a constant
number of operations; hence the factorial function runs in O(n) time.
In analyzing the English ruler application, the fundamental question is how many total
lines of output are generated by an initial call to draw_interval(c), where c denotes the center
length. This is a reasonable benchmark for the overall efficiency of the algorithm, as each line
of output is based upon a call to the draw_line utility, and each recursive call to draw_interval
with a nonzero parameter makes exactly one direct call to draw_line. Some intuition may be
gained by examining the source code and the recursion trace. We know that a call to
draw_interval(c) for c > 0 spawns two calls to draw_interval(c - 1) and a single call to
draw_line. We will rely on this intuition to prove the following claim.
Proposition 4.1: For c ≥ 0, a call to draw_interval(c) results in precisely 2^c - 1 lines of
output.
Justification:
In fact, induction is a natural mathematical technique for proving the correctness and
efficiency of a recursive process. In the case of the ruler, we note that an application of
draw_interval(0) generates no output, and that 2^0 - 1 = 1 - 1 = 0. This serves as a base
case for our claim. More generally, the number of lines printed by draw_interval(c) is one
more than twice the number generated by a call to draw_interval(c - 1), as one center line is
printed between two such recursive calls. By induction, we have that the number of lines is
thus 1 + 2·(2^(c-1) - 1) = 1 + 2^c - 2 = 2^c - 1. This proof is indicative of a more
mathematically rigorous tool, known as a recurrence equation, that can be used to analyze
the running time of a recursive algorithm.
Considering the running time of the binary search algorithm, a constant number of
primitive operations are executed at each recursive call of a binary search. Hence, the
running time is proportional to the number of recursive calls performed. At most log n + 1
recursive calls are made during a binary search of a sequence having n elements, leading to
the following claim: the binary search algorithm runs in O(log n) time for a sorted sequence
with n elements.
Justification:
With each recursive call, the number of candidate entries still to be searched is given by
the value high - low + 1. Moreover, the number of remaining candidates is reduced by at
least one half with each recursive call. Specifically, from the definition of mid, the number
of remaining candidates is either
    (mid - 1) - low + 1 = floor((high - low + 1) / 2), or
    high - (mid + 1) + 1 = ceil((high - low + 1) / 2) - 1.
Initially, the number of candidates is n; after the first call in a binary search, it is at most
n/2; after the second call, it is at most n/4; and so on. In general, after the j-th call in a binary
search, the number of candidate entries remaining is at most n/2^j. In the worst case (an
unsuccessful search), the recursive calls stop when there are no more candidate entries.
Hence, the maximum number of recursive calls performed is the smallest integer r such that
n/2^r < 1. In other words (recalling that we omit a logarithm's base when it is 2), r > log n.
Thus, we have r = floor(log n) + 1, which implies that binary search runs in O(log n) time.
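This bound can be checked empirically with an instrumented version of binary search (an illustrative sketch; the counter tallies only the calls that examine a non-empty interval):

```python
import math

# Recursive binary search that also counts the calls which examine a
# non-empty interval; the analysis bounds this by floor(log2(n)) + 1.
def binary_search_count(data, target, low, high):
    if low > high:
        return False, 0                  # empty interval: not counted
    mid = (low + high) // 2
    if target == data[mid]:
        return True, 1
    elif target < data[mid]:
        found, calls = binary_search_count(data, target, low, mid - 1)
    else:
        found, calls = binary_search_count(data, target, mid + 1, high)
    return found, calls + 1

n = 1000
data = list(range(n))
found, calls = binary_search_count(data, -1, 0, n - 1)  # unsuccessful search
print(found, calls, int(math.log2(n)) + 1)
```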
To characterize the “problem size” for our analysis, we let n denote the number of
file-system entries in the portion of the file system that is considered. (For example, the file
system portrayed in Figure 4.6 has n = 19 entries.) To characterize the cumulative time spent
for an
initial call to the disk usage function, we must analyze the total number of recursive invocations
that are made, as well as the number of operations that are executed within those invocations.
Intuitively, a call to disk usage for a particular entry ‘e’ of the file system is made
only once, so each entry is explored only once.
Each iteration of that loop makes a recursive call to disk usage, and yet
we have already concluded that there are a total of n calls to disk usage (including the original
call). We therefore conclude that there are O(n) recursive calls, each of which uses O(1) time
outside the loop, and that the overall number of operations due to the loop is O(n). Summing
all of these bounds, the overall number of operations is O(n).
The number of bits allocated for each primitive data type depends on the programming
languages, the compiler and the operating system.
For the same primitive data type, different languages may use different sizes.
Depending on the size of the data types, the total available values (domain) will also change.
For example, “int” may take 2 bytes or 4 bytes. If it takes 2 bytes (16 bits), then the
total possible values range from -32,768 to +32,767 (-2^15 to 2^15 - 1). If it takes 4 bytes (32
bits), then the possible values are between -2,147,483,648 and +2,147,483,647 (-2^31 to
2^31 - 1). The same is the case with other data types.
A data structure is a special format for organizing and storing data. General data
structure types include arrays, files, linked lists, stacks, queues, trees, graphs and so on.
Depending on the organization of the elements, data structures are classified into two types:
1) Linear data structures: Elements are accessed in a sequential order but it is not
compulsory to store all elements sequentially. Examples: Linked Lists, Stacks and Queues.
2) Non-linear data structures: Elements of this data structure are stored/accessed in
a non-linear order. Examples: trees and graphs.
7. Characteristics of Python?
Robustness
Adaptability
Reusability
Modularity
9. Is Python Adaptable?
Software needs to be able to evolve over time in response to changing conditions in
its environment. Thus, another important goal of quality software is that it achieves
adaptability (also called evolvability).
Related to this concept is portability, which is the ability of software to run with minimal
change on different hardware and operating system platforms. An advantage of writing
software in Python is the portability provided by the language itself.
Such reuse should be done with care, however, for one of the major sources of
software errors in the Therac-25 came from inappropriate reuse of Therac-20 software.
Python supports abstract data types using a mechanism known as an abstract base
class (ABC). An abstract base class cannot be instantiated (i.e., you cannot directly create an
instance of that class), but it defines one or more common methods that all implementations
of the abstraction must have.
An ABC is realized by one or more concrete classes that inherit from the abstract base
class while providing implementations for those methods declared by the ABC.
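As an illustrative sketch (the Shape and Square names here are assumptions, not from the text), Python's abc module realizes this mechanism:

```python
from abc import ABC, abstractmethod

class Shape(ABC):                  # abstract base class: cannot be instantiated
    @abstractmethod
    def area(self):
        ...

class Square(Shape):               # concrete class realizing the ABC
    def __init__(self, side):
        self.side = side

    def area(self):                # implements the method declared by the ABC
        return self.side * self.side

print(Square(3).area())            # 9
```

Trying to instantiate Shape() directly raises a TypeError, which is how Python enforces the abstraction.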
Wrapping up of data and the methods that operate on that data into a single unit is called encapsulation.
Different components of a software system should not reveal the internal details of their
respective implementations.
Encapsulation yields robustness and adaptability, for it allows bugs to be fixed or new
functionality to be added with relatively local changes to a component.
A class is a collection of objects. A class contains the blueprint or prototype from
which the objects are created. It is a logical entity that contains some attributes and
methods.
A class provides a set of behaviors in the form of member functions (also known as
methods), with implementations that are common to all instances of that class.
A class also determines the way that state information for each instance is represented,
in the form of attributes (also known as fields, instance variables, or data members).
Example:
class Person:
    def __init__(self, a, b):
        print("Sum=", a + b)

obj = Person(2, 3)
Output:
Sum= 5
__init__ is a special method that serves as the constructor of the class. Its primary
responsibility is to establish the state of a newly created object with appropriate
instance variables.
This allows a new class to be defined based upon an existing class as the starting
point. In object-oriented terminology, the existing class is typically described as the base class,
parent class, or superclass, while the newly defined class is known as the subclass or child
class.
Syntax:
class BaseClass:
    # body of base class

class DerivedClass(BaseClass):
    # body of derived class
The derived class inherits features from the base class, and new features can be added to
it. This results in reusability of code.
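A minimal sketch of this, with assumed Person/Student names:

```python
class Person:                          # base class
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name

class Student(Person):                 # derived class: inherits greet()
    def __init__(self, name, roll):
        super().__init__(name)         # reuse the base-class constructor
        self.roll = roll

s = Student("Asha", 7)
print(s.greet())                       # inherited behavior: Hello, Asha
```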
1. Built-In
2. Global
3. Local
This is the namespace that holds all the global objects. This namespace gets created
when the program starts running and exists till the end of the execution.
This namespace gets created when the interpreter starts. It stores all the built-in
names. This is the superset of all the namespaces. This is the reason we can use
print, True, etc. from any part of the code.
This is the namespace that generally exists for only part of the time during the
execution of the program. It stores the names of the objects defined inside a function.
These namespaces exist only as long as the functions are executing. This is the reason we cannot
access a variable created inside a function from outside it.
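The code that produced the output shown below is missing from this copy; a minimal reconstruction (the function name f is an assumption) behaves the same way:

```python
def f():
    var2 = 10          # var2 is created in f's local namespace
    print(var2)        # accessible while f is running

f()
try:
    print(var2)        # the local namespace of f no longer exists
except NameError as e:
    print(e)           # name 'var2' is not defined
```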
Output:
NameError: name 'var2' is not defined
In Python, the = operator does not create a copy of an object; it only creates a new variable
that shares the reference of the original object.
There are two other ways to create copies:
o Shallow Copy
o Deep Copy
To make these copies work, the copy module is used.
For example:
import copy
copy.copy(x)        # shallow copy of x
copy.deepcopy(x)    # deep copy of x
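The difference between the two matters for nested objects; a short sketch:

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)        # new outer list, shared inner lists
deep = copy.deepcopy(original)       # new outer list and new inner lists

original[0][0] = 99
print(shallow[0][0])                 # 99  (inner list is shared)
print(deep[0][0])                    # 1   (inner list was copied)
```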
31. Give the procedure to create a new object from original elements.
A deep copy creates a new object and recursively adds the copies of nested objects
present in the original elements.
Worst case
Defines the input for which the algorithm takes a long time (slowest time to
complete).
Input is the one for which the algorithm runs the slowest.
Best case
Defines the input for which the algorithm takes the least time (fastest time to
complete).
Input is the one for which the algorithm runs the fastest.
Average case
Provides a prediction about the running time of the algorithm.
Run the algorithm many times on many different inputs, and divide the total running
time by the number of trials.
Assumes that the input is random.
There are some general rules to help us determine the running time of an algorithm.
1) Loops: The running time of a loop is, at most, the running time of the statements inside the
loop (including tests) multiplied by the number of iterations.
38. How will you analyse the running time complexity of Nested Loops?
Nested loops: Analyze from the inside out. The total running time is the product of the sizes
of all the loops.
Example:
for (i = 1; i <= n; i++)            // outer loop
    for (j = 1; j <= n; j++)        // inner loop
        k = k + 1;                  // constant time, c
Total time = c × n × n = cn² = O(n²).
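The same count can be checked in Python for a small n (n = 5 here is just an illustration):

```python
n = 5
count = 0
for i in range(n):        # outer loop: n iterations
    for j in range(n):    # inner loop: n iterations for each i
        count += 1        # constant-time statement
print(count)              # 25, i.e. n * n
```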
39. How will you analyse the running time complexity of Consecutive Statements?
Consecutive statements: add the running times of the individual statements; the total is
dominated by the largest term.
40. How will you analyse the running time complexity of if-then-else statements?
If-then-else statements - Worst-case running time: the test, plus either the then part or
the else part (whichever is the larger).
if (length() == 0)                              // test: constant, c0
    return false;                               // then part: constant, c1
else
    for (int n = 0; n < length(); n++)          // loop: n iterations, c2 each
        if (!list[n].equals(otherList.list[n])) // body: constant, c3
            return false;
Total time = c0 + c1 + (c2 + c3) × n = O(n).
Recursion is a technique by which a function makes one or more calls to itself during
execution, until the condition gets satisfied. Recursion provides a powerful alternative for
performing repetitive tasks.
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
42. Give the procedure for drawing an English Ruler using Recursion.
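The worked procedure is missing from this copy; a sketch of the standard recursive solution (as presented in the first text book, Goodrich et al.), which draws each interval by recursing on both sides of a center tick:

```python
def draw_line(tick_length, tick_label=''):
    # draw one line of dashes, followed by an optional label
    line = '-' * tick_length
    if tick_label:
        line += ' ' + tick_label
    print(line)

def draw_interval(center_length):
    # draw the ticks of an interval whose central tick has the given length
    if center_length > 0:                  # stop when the length drops to 0
        draw_interval(center_length - 1)   # recursively draw the top ticks
        draw_line(center_length)           # draw the center tick
        draw_interval(center_length - 1)   # recursively draw the bottom ticks

def draw_ruler(num_inches, major_length):
    draw_line(major_length, '0')           # the '0' line
    for j in range(1, 1 + num_inches):
        draw_interval(major_length - 1)    # ticks between inch marks
        draw_line(major_length, str(j))    # the inch line with its label

draw_ruler(2, 3)
```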
List ADT:
Array Implementation:
The basic structure for storing and accessing a collection of data is the array. A one-
dimensional array is a collection of contiguous elements in which individual elements are
identified by a unique integer subscript starting with zero. Once an array is created, its size
cannot be changed.
Python’s list structure is a mutable sequence container that can change size as items
are added or removed. It is an abstract data type that is implemented using an array structure
to store the items contained in the list.
Appending Items
pyList.append( 50 )
If there is room in the array, the item is stored in the next available slot of the array and
the length field is incremented by one.
pyList.append( 18 )
pyList.append( 64 )
pyList.append( 6)
After the second statement is executed, the array becomes full and there is no
available space to add more values.
By definition, a list can contain any number of items and never becomes full. Thus,
when the third statement is executed, the array will have to be expanded to make room for
value 6. An array cannot change size once it has been created. To allow for the expansion of the
list, the following steps have to be performed:
(1) A new array with a larger capacity is created,
(2) the items from the original array are copied to the new array,
(3) the new, larger array is set as the data structure for the list,
(4) the original smaller array is destroyed. After the array has been expanded, the
value can be appended to the end of the list.
Extending a List
A list can be appended to a second list using the extend() method as shown in the
following example:
pyListA = [ 34, 12 ]
pyListB = [ 4, 6, 31, 9 ]
pyListA.extend( pyListB )
If the list being extended has the capacity to store all of the elements from the second
list, the elements are simply copied, element by element. If there is not enough capacity for all
of the elements, the underlying array has to be expanded as was done with the append()
method.
Inserting Items
An item can be inserted anywhere within the list using the insert() method. In the
following example pyList.insert( 3, 79 ) we insert the value 79 at index position 3. Since there
is already an item at that position, we must make room for the new item by shifting all of the
items down one position starting with the item at index position 3. After shifting the items, the
value 79 is then inserted at position 3.
Removing Items
An item can be removed from any position within the list using the pop() method.
Consider the following code segment, which removes both the first and last items from the
sample list:
pyList.pop( 0 )
pyList.pop()
The first statement removes the first item from the list. After the item is removed,
typically by setting the reference variable to None, the items following it within the array are
shifted down, from left to right, to close the gap. Finally, the length of the list is decremented
to reflect the smaller size.
The second pop() operation in the example code removes the last item from the list.
Since there are no items following the last one, the only operations required are to remove the
item and decrement the size of the list. After removing an item from the list, the size of the
array may be reduced using a technique similar to that for expansion. This reduction occurs
when the number of available slots in the internal array falls below a certain threshold. For
example, when more than half of the array elements are empty, the size of the array may be
cut in half.
List Slice
Slicing is an operation that creates a new list consisting of a contiguous subset of
elements from the original list. The original list is not modified by this operation. Instead,
references to the corresponding elements are copied and stored in the new list. In Python,
slicing is performed on a list using the colon operator, specifying the index of the first
element in the subset and the index just past the last element. Consider the following example
code segment, which creates a slice from our sample list: aSlice = theVector[2:3]
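For instance (the contents of theVector are assumed here, since the sample list itself is not shown on this page):

```python
theVector = [12, 4, 19, 9, 7, 10, 2]   # a hypothetical sample list
aSlice = theVector[2:3]                 # index 2 up to, not including, index 3
print(aSlice)                           # [19]
print(theVector)                        # the original list is unchanged
```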
Python Code
import ctypes

class dy_array:
    def __init__(self):
        self.n = 0                           # number of items stored
        self.capacity = 1                    # current capacity of the array
        self.Arr = self.makearray(self.capacity)

    def makearray(self, c):
        return (c * ctypes.py_object)()      # low-level array of capacity c

    def findlength(self, obj):
        count = 0                            # count the items in any iterable
        for i in obj:
            count += 1
        return count
    def getitem(self, x):
        for i in range(self.n):              # return the index of x, if present
            if self.Arr[i] == x:
                return i
        print("Data Not Found")

    def append(self, obj):
        if self.n == self.capacity:
            self.resize(2 * self.capacity)
        self.Arr[self.n] = obj
        self.n += 1

    def resize(self, c):
        B = self.makearray(c)                # new array of capacity c
        for i in range(self.n):              # copy the existing items
            B[i] = self.Arr[i]
        self.Arr = B
        self.capacity = c

    def insert(self, pos, val):
        if self.n == self.capacity:
            self.resize(2 * self.capacity)
        for i in range(self.n, pos, -1):     # shift items right
            self.Arr[i] = self.Arr[i - 1]
        self.Arr[pos] = val
        self.n += 1

    def extend(self, val):
        length = self.findlength(val)
        for i in range(length):
            self.append(val[i])

    def remove(self, val):
        for i in range(self.n):
            if self.Arr[i] == val:
                for j in range(i, self.n - 1):   # shift items left
                    self.Arr[j] = self.Arr[j + 1]
                self.n -= 1
                return
        print("Data Not Found")

    def disp(self):
        for i in range(self.n):
            print(self.Arr[i])
After Insertion
class L:
    class Node:
        def __init__(self, data):
            self.data = data
            self.next = None

    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def len(self):
        return self.size

    def insert_first(self, data):
        newnode = L.Node(data)
        newnode.next = self.head
        self.head = newnode
        if self.tail == None:        # list was empty
            self.tail = newnode
        self.size += 1
    def insert_last(self, data):
        newnode = L.Node(data)
        if self.tail == None:        # list is empty
            self.head = self.tail = newnode
        else:
            self.tail.next = newnode
            self.tail = newnode
        self.size += 1
    def remove_first(self):
        if self.head == None:
            print("Invalid")
        else:
            self.head = self.head.next
            if self.head == None:    # list became empty
                self.tail = None
            self.size -= 1
    def display(self):
        n = self.head
        for i in range(self.size):
            print(n.data)
            n = n.next
    def length(self):
        return self.size
A circularly linked list is a collection of nodes that collectively form a linear sequence,
with the next pointer of the tail node pointing back to the head of the list.
A circularly linked list provides a more general model than a standard linked list for
data sets that are cyclic, that is, which do not have any particular notion of a beginning and
end.
(Figure: a circularly linked list with nodes A1, A2, A3, where the next pointer of A3 points back to A1.)
Insert at first:
def insert_first(self, data):
    newnode = L.Node(data)
    newnode.next = self.head
    self.head = newnode
    if self.tail == None:        # list was empty
        self.tail = newnode
    self.tail.next = self.head   # keep the list circular
    self.size += 1
Insert at Last:
def insert_last(self, data):
    newnode = L.Node(data)
    if self.tail == None:        # list is empty
        self.head = self.tail = newnode
    else:
        self.tail.next = newnode
        self.tail = newnode
    self.tail.next = self.head   # keep the list circular
    self.size += 1
Remove First:
def remove_first(self):
    if self.head == None:
        print("Invalid")
    else:
        if self.size == 1:           # removing the only node
            self.head = self.tail = None
        else:
            self.tail.next = self.head.next
            self.head = self.head.next
        self.size -= 1

def length(self):
    return self.size
A linked list in which each node keeps an explicit reference to the node before it
and a reference to the node after it is known as a doubly linked list.
Advantages:
Deletion operation is easier.
Finding the predecessor and successor of node is easier.
Algorithm
1. Create a newnode
2. If there is no list already, make newnode as Head and Tail.
3. else Find the node predata.
4. Update:
newnode.next = predata.next
predata.next.prev = newnode
newnode.prev = predata
predata.next = newnode
5. Increase the size.
Algorithm:
class List:
    class Node:
        def __init__(self, data):
            self.data = data
            self.prev = None
            self.next = None

    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def len(self):
        return self.size

    def insert(self, predata, data):
        newnode = List.Node(data)
        if self.head == None:
            self.head = newnode
            self.tail = newnode
            self.size += 1
        else:
            temp = self.head
            while temp != None:
                if temp.data == predata:
                    newnode.next = temp.next
                    if temp.next != None:        # not inserting after the tail
                        temp.next.prev = newnode
                    else:
                        self.tail = newnode
                    temp.next = newnode
                    newnode.prev = temp
                    self.size += 1
                    break
                temp = temp.next
            else:
                print("data not found")
    def remove(self, x):
        temp = self.head
        if self.head == None:
            print("Empty List")
            return
        elif self.head.data == x and self.size == 1:
            self.head = None
            self.tail = None
            self.size -= 1
            print("First node deleted")
            return
        elif self.head.data == x:
            self.head = self.head.next
            self.head.prev = None
            self.size -= 1
            return
        else:
            while temp != None and temp.data != x:
                temp = temp.next
            if temp == None:
                print("Data not found")
                return
            if temp == self.tail:            # removing the last node
                self.tail = temp.prev
                self.tail.next = None
            else:
                temp.prev.next = temp.next
                temp.next.prev = temp.prev
            self.size -= 1
            return
    def display(self):
        if self.head == None:
            print("List empty")
        else:
            n = self.head
            for i in range(self.size):
                print(n.data)
                n = n.next
Stack ADT
A stack is an ordered list in which all insertions and deletions are made at one end,
called the top.
Stack is a list with the restriction that insertions and deletions can be performed in
only one position, namely the end of the list called Top.
It follows the LIFO approach. LIFO stands for “Last In, First Out”. The basic
operations are push and pop. Push is equivalent to insert; pop deletes the most
recently inserted element.
Stack Model:
(Figure: push(x) inserts at the top of the stack and pop removes from the top. The stack shown holds, from Top/ToS downward: 3, 10, 6, 4, 5.)
Implementation of Stack:
There are two methods of implementing stack operations.
Array implementation
Linked List implementation
Push Operation:
The process of putting a new data element onto stack is known as a Push
Operation.
Push operation involves a series of steps –
Step 1 − Checks if the stack is full.
Step 2 − If the stack is full, produces an error and exit.
Step 3 − If the stack is not full, increments top to point next empty space.
Step 4 − Adds data element to the stack location, where top is pointing.
Pop Operation:
POP operation is performed on the stack to remove items from the stack
Pop operation involves a series of steps –
Step 1 − Check if top == -1; if so, the stack is empty (underflow), so exit.
Step 2 − Access the element top is pointing to: num = stk[top];
Step 3 − Decrease top by 1: top = top - 1;
isFull():
To check whether the stack is full or not before every push operation.
isEmpty():
To check whether the stack is Empty or not before every pop operation.
def push(self, data):
    newnode = Stack.node(data)
    if self.top == None:
        self.top = newnode
    else:
        newnode.next = self.top
        self.top = newnode
    self.size += 1

def pop(self):
    if self.isempty():
        print("Stack is empty")
    else:
        self.top = self.top.next
        self.size -= 1

def isFull(self):
    return self.size == MaxSize    # applies only when a maximum size is imposed

def isempty(self):
    return self.size == 0

def length(self):
    return self.size

def display(self):
    temp = self.top
    for i in range(self.size):
        print(temp.data)
        temp = temp.next
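A condensed, self-contained version of the linked stack above, showing the LIFO behavior:

```python
class Stack:                       # condensed linked-stack sketch
    class node:
        def __init__(self, data):
            self.data = data
            self.next = None

    def __init__(self):
        self.top = None
        self.size = 0

    def push(self, data):          # insert at the top
        newnode = Stack.node(data)
        newnode.next = self.top
        self.top = newnode
        self.size += 1

    def pop(self):                 # remove from the top
        if self.size == 0:
            print("Stack is empty")
        else:
            self.top = self.top.next
            self.size -= 1

s = Stack()
for x in [5, 4, 6, 10, 3]:
    s.push(x)
s.pop()                            # removes 3, the most recently pushed item
print(s.top.data)                  # 10
```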
Queue ADT
Queue is an ordered collection of data items. It delete item at front of the queue. It
inserts item at rear of the queue. It has FIFO structure i.e. “First In First Out”.
Queue Model:
Queue Operations:
1. Enqueue:
To add an item to the queue. If the queue is full, then it is said to be an Overflow
condition.
Code for Enqueue:
def Enqueue(self, data):
    newnode = Queue.Node(data)
    if self.size == 0:
        self.Front = self.Rear = newnode
    else:
        self.Rear.next = newnode
        self.Rear = newnode
    self.size += 1
2. Dequeue:
Dequeue: Removes an item from the queue. The items are removed in the same
order in which they were inserted. If the queue is empty, then it is said to be an Underflow
condition.
Code for Dequeue:
def Dequeue(self):
    if self.size == 0:
        print("Queue is Empty")
    else:
        self.Front = self.Front.next
        if self.Front == None:     # queue became empty
            self.Rear = None
        self.size -= 1
3. isFull:
To check whether the queue is full before every Enqueue Operation.
Code for isFull:
def isFull(self):
    return self.size == MaxSize    # applies only when a maximum size is imposed
def isempty(self):
    return self.size == 0

def length(self):
    return self.size
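A condensed, self-contained version of the linked queue above, showing the FIFO behavior:

```python
class Queue:                       # condensed linked-queue sketch
    class Node:
        def __init__(self, data):
            self.data = data
            self.next = None

    def __init__(self):
        self.Front = None
        self.Rear = None
        self.size = 0

    def Enqueue(self, data):       # insert at the rear
        newnode = Queue.Node(data)
        if self.size == 0:
            self.Front = self.Rear = newnode
        else:
            self.Rear.next = newnode
            self.Rear = newnode
        self.size += 1

    def Dequeue(self):             # remove from the front
        if self.size == 0:
            print("Queue is Empty")
        else:
            self.Front = self.Front.next
            self.size -= 1

q = Queue()
for x in [1, 2, 3]:
    q.Enqueue(x)
q.Dequeue()                        # removes 1: first in, first out
print(q.Front.data)                # 2
```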
Data structure that supports insertion and deletion at both the front and the back of the
queue is called a double ended queue, or deque.
class Deque:
    class Node:
        def __init__(self, data):
            self.data = data
            self.prev = None
            self.next = None

    def __init__(self):
        self.Front = None
        self.Rear = None
        self.size = 0

    def Enqueue_Front(self, data):
        newnode = Deque.Node(data)
        if self.isempty():
            self.Front = self.Rear = newnode
        else:
            newnode.next = self.Front
            self.Front.prev = newnode
            self.Front = newnode
        self.size += 1

    def Enqueue_Rear(self, data):
        newnode = Deque.Node(data)
        if self.isempty():
            self.Front = self.Rear = newnode
        else:
            self.Rear.next = newnode
            newnode.prev = self.Rear
            self.Rear = newnode
        self.size += 1
    def Dequeue_Front(self):
        if self.isempty():
            print("Deque is Empty")
        else:
            self.Front = self.Front.next
            if self.Front == None:       # deque became empty
                self.Rear = None
            else:
                self.Front.prev = None
            self.size -= 1

    def Dequeue_Rear(self):
        if self.isempty():
            print("Deque is Empty")
        else:
            self.Rear = self.Rear.prev
            if self.Rear == None:        # deque became empty
                self.Front = None
            else:
                self.Rear.next = None
            self.size -= 1

    def isempty(self):
        return self.size == 0

    def display(self):
        temp = self.Front
        for i in range(self.size):
            print(temp.data)
            temp = temp.next
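Python's standard library offers collections.deque, which provides the same four operations:

```python
from collections import deque

d = deque()
d.appendleft(2)        # like Enqueue_Front
d.append(5)            # like Enqueue_Rear
d.appendleft(1)        # like Enqueue_Front
print(list(d))         # [1, 2, 5]
d.popleft()            # like Dequeue_Front: removes 1
d.pop()                # like Dequeue_Rear: removes 5
print(list(d))         # [2]
```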
5. What is abstract data type? What are all not concerned in an ADT?
The abstract data type is a triple (D, F, A), where D is a set of domains, F is a set of
functions and A is a set of axioms, in which only what is to be done is mentioned, but how
it is to be done is not. Thus an ADT is not concerned with implementation details.
6. List out the areas in which data structures are applied extensively.
Following are the areas in which data structures are applied extensively.
Operating system - the data structures like priority queues are used for scheduling
the jobs in the operating system.
Compiler design - the tree data structure is used in parsing the source program.
Stack data structure is used in handling recursive calls.
12. State the properties of LIST abstract data type with suitable example.
Various properties of the LIST abstract data type are:
It is a linear data structure in which the elements are arranged adjacent to each other.
It allows a single-variable polynomial to be stored.
If the LIST is implemented using dynamic memory, then it is called a linked list. Examples
of LIST are stacks, queues and linked lists.
13. State the advantages of circular lists over doubly linked list.
In a circular list the next pointer of the last node points to the head node, whereas in a doubly
linked list each node has two pointers: a previous pointer and a next pointer. The
main advantage of the circular list over the doubly linked list is that with the help of a single
pointer field we can reach the head node quickly. Hence some amount of memory is saved,
because in a circular list only one pointer field is reserved per node.
14. What are the advantages of doubly linked list over singly linked list?
The doubly linked list has two pointer fields: a previous link field and a next link field.
Because of these two pointer fields we can access any node efficiently in either direction,
whereas in a singly linked list there is only one pointer field, which stores the forward pointer.
28. Write down the function to insert an element into a queue, in which the queue is
implemented as an array.
void enqueue(int X, Queue Q)
{
    if (IsFull(Q))
        Error("Full queue");
    else
    {
        Q->Size++;
        Q->Rear = Q->Rear + 1;
        Q->Array[Q->Rear] = X;
    }
}
Bubble sort – selection sort – insertion sort – merge sort – quick sort – linear
search – binary search – hashing – hash functions – collision handling – load factors,
rehashing, and efficiency.
Sorting:
Sorting is used to arrange the data (a collection of items) in the array/list in ascending or
descending order. Given a collection, the goal is to rearrange the elements so that they are
ordered from smallest to largest, or from largest to smallest.
1) Bubble Sort:
There is a simple, but inefficient algorithm, called bubble-sort, for sorting a list L of n
comparable elements. This algorithm scans the list n−1 times, where, in each scan, the
algorithm compares the current element with the next one and swaps them if they are out of
order.
This algorithm uses multiple passes, and in each pass the first and second data items
are compared.
If the first data item is bigger than the second, then the two items are swapped.
Next, the items in the second and third positions are compared, and if the first one is
larger than the second, they are swapped; otherwise their order is unchanged.
This process continues for each successive pair of data items until all items are
sorted.
Algorithm:
# Bubble Sort
def BubbleSort(A):
    for i in range(len(A)):
        for j in range(len(A) - 1):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
    print(A)

BubbleSort([4, 2, 7, 3, 1, 8, 9])
Output:
[1, 2, 3, 4, 7, 8, 9]
Step-by-step example:
Let us take the array of numbers "6 2 5 3 9", and sort the array from lowest number to
greatest number using bubble sort.
In each step, elements written in bold are being compared. Three passes will be
required.
Time Complexity:
The efficiency of the bubble sort algorithm is independent of the initial arrangement of the
data items. If the array contains n data items, then the outer loop executes n − 1 times, as
the algorithm requires n − 1 passes.
In the first pass, the inner loop is executed n − 1 times; in the second pass, n − 2 times; in
the third pass, n − 3 times; and so on. The total number of iterations of the inner loop is the
sum of the first n − 1 integers, (n − 1) + (n − 2) + ... + 1 = n(n − 1)/2, resulting in a run time
of O(n²).
Worst Case Performance O(n²)
Best Case Performance O(n²)
Average Case Performance O(n²)
2) Selection Sort
Selection sort is one of the simplest sorting algorithms. It sorts the elements in an array
by finding the minimum element in each pass from the unsorted part and keeping it at
the beginning.
This sorting technique improves over bubble sort by making only one exchange in each
pass. It maintains two subarrays: one that is already sorted and one that is unsorted. In
each iteration the minimum element (for ascending order) is picked from the unsorted
subarray and moved to the sorted subarray.
Python Code
# Selection Sort
def SelectionSort(A):
    for i in range(len(A)):
        min_index = i                    # index of the smallest item so far
        for j in range(i + 1, len(A)):
            if A[min_index] > A[j]:
                min_index = j
        A[i], A[min_index] = A[min_index], A[i]
    print(A)

SelectionSort([3, 20, 1, 4, 5, 2])
Step-by-step example:
Time Complexity:
Selection sort is not difficult to analyse compared to other sorting algorithms since none
of the loops depend on the data in the array. Selecting the lowest element requires scanning
all n elements (this takes n − 1 comparisons) and then swapping it into the first position.
Finding the next lowest element requires scanning the remaining n − 1 elements and so on,
for (n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1)/2 ∈ O(n²) comparisons. Each of these scans
requires one swap for n − 1 elements (the final element is already in place).
Worst Case Performance O(n²)
Best Case Performance O(n²)
Average Case Performance O(n²)
3) Insertion Sort:
We start with the first element in the array. One element by itself is already sorted.
Then we consider the next element in the array. If it is smaller than the first, we swap them.
Next we consider the third element in the array. We swap it leftward until it is in its
proper order with the first two elements. We then consider the fourth element, and swap it
leftward until it is in the proper order with the first three.
We continue in this manner with the fifth element, the sixth, and so on, until the whole
array is sorted.
Algorithm InsertionSort(A):
Input: An array A of n comparable elements
Output: The array A with elements rearranged in nondecreasing order
for k from 1 to n − 1 do
Insert A[k] at its proper location within A[0], A[1], ..., A[k].
Step-by-step example:
Algorithm:
def InsertionSort(A):
    for j in range(1, len(A)):
        i = j
        while i > 0:
            if A[i] < A[i - 1]:
                A[i], A[i - 1] = A[i - 1], A[i]
            i -= 1
    print(A)

InsertionSort([5, 14, 30, 2, 1])
Time Complexity:
Worst Case Performance O(n²)
Best Case Performance O(n) (nearly sorted input)
Average Case Performance O(n²)
4) Merge Sort:
Merge sort is based on the divide and conquer method. It takes the list to be sorted
and divides it in half to create two unsorted lists. The two unsorted lists are then sorted and
merged to get a sorted list. The two unsorted lists are sorted by continually dividing them
with the Partition function; we eventually get lists of size 1, which are already sorted. The
lists of size 1 are then merged.
1. Divide the input which we have to sort into two parts in the middle. Call it the left
part and right part.
2. Sort each of them separately. Note that here sort does not mean to sort it using
some other method. We use the same function recursively.
3. Then merge the two sorted parts.
Step-by-step Example:
Algorithm:
def merge(S1, S2, S):
    i = j = 0
    while i + j < len(S):
        if j == len(S2) or (i < len(S1) and S1[i] < S2[j]):
            S[i + j] = S1[i]
            i += 1
        else:
            S[i + j] = S2[j]
            j += 1
def Partition(S):
    n = len(S)
    if n < 2:
        return
    mid = n // 2
    S1 = S[0:mid]        # copy of first half
    S2 = S[mid:n]        # copy of second half
    Partition(S1)        # sort copy of first half
    Partition(S2)        # sort copy of second half
    merge(S1, S2, S)     # merge sorted halves back into S

A = [85, 24, 63, 450, 170, 31, 96, 50]
Partition(A)
print("Sorted Array is")
for i in range(len(A)):
    print(A[i])
5) Quick Sort:
The quick sort algorithm also uses the divide and conquer strategy. But unlike the
merge sort, which splits the sequence of keys at the midpoint, the quick sort partitions the
sequence by dividing it into two segments based on a selected pivot key. In addition, the quick
sort can be implemented to work with virtual subsequences without the need for temporary
storage.
Quick sort is a divide and conquer algorithm. Quick sort first divides a large list into
two smaller sublists: the low elements and the high elements. Quick sort can then recursively
sort the sub-lists.
Step-by-step Example 1:
Initial array: 15 6 8 4 11 9 2 1 5, with pivot = 5 (the last element), left at position 0 and
right at the last-but-one position.
While left < right: move left forward while its item is less than the pivot, move right
backward while its item is greater than the pivot, and if left is still before right, swap the
two items:
15 and 1 are swapped: 1 6 8 4 11 9 2 15 5
6 and 2 are swapped: 1 2 8 4 11 9 6 15 5
8 and 4 are swapped: 1 2 4 8 11 9 6 15 5
When left and right cross, swap the item at left with the pivot and lock the pivot in place:
1 2 4 5 11 9 6 15 8
Now do quicksort for the sublist to the left of the pivot (1 2 4) and the sublist to its right
(11 9 6 15 8) separately, each time taking the last element as the new pivot.
Final sorted array: 1 2 4 5 6 8 9 11 15
Algorithm:
def Quicksort(S, l, r):
    if l >= r:
        return
    pivot = S[r]                       # choose the last element as the pivot
    left = l
    right = r - 1
    while left <= right:
        while left <= right and S[left] < pivot:
            left += 1
        while left <= right and pivot < S[right]:
            right -= 1
        if left <= right:
            S[left], S[right] = S[right], S[left]
            left = left + 1
            right = right - 1
    S[left], S[r] = S[r], S[left]      # put the pivot into its final place
    Quicksort(S, l, left - 1)
    Quicksort(S, left + 1, r)

A = [41, 21, 5, 10, 6, 3]
Quicksort(A, 0, 5)
print(A)
Searching:
1) Linear Search:
The algorithm uses the guess-and-check pattern: it first guesses that the target item is the
first item in the list, and then checks the subsequent items to see whether the guess was
incorrect.
When the sequence is unsorted, the standard approach to search for a target
value is to use a loop to examine every element, until either finding the target or exhausting
the data set. This is known as the Linear or sequential search algorithm. This algorithm runs
in O(n) time (i.e., linear time) since every element is inspected in the worst case.
Example:
List : 10,51,2,18,4,31,13,5,23,64,29 Element to be searched : 31
Algorithm:
# Linear Search
def LinearSearch(List, data):
    for i in range(len(List)):
        if List[i] == data:
            print(data, "present at position", i + 1)
            break
    else:
        print("Element not found!")
List=[10,51,2,18,4,31,13,5,23,64,29]
LinearSearch(List,31)
Output:
31 present at position 6
2) Binary Search:
Binary search requires the values to be stored in sorted order within an indexable sequence, such as a Python list.
When the sequence is sorted and indexable, there is a much more efficient algorithm than linear search.
Initially, low = 0 and high = n − 1. We then compare the target value to the median candidate,
that is, the item data[mid] with index mid = (low + high) // 2.
def BinarySearch(List, data, low, high):
    if low <= high:
        mid = (low + high) // 2
        if data == List[mid]:
            print(data, "present at position", mid + 1)
        elif data < List[mid]:
            BinarySearch(List, data, low, mid - 1)
        else:
            BinarySearch(List, data, mid + 1, high)
    else:
        print("Element not found!")
List = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
BinarySearch(List, 1, 0, 9)
Output:
1 present at position 1
Hashing:
What is Hashing?
Hashing in data structures is a technique of mapping a large chunk of data into small
tables using a hashing function. It is a technique that uniquely identifies a specific item from
a collection of similar items, and it uses hash tables to store the data in an array format.
Each value in the array is assigned a unique index number. Hash tables use a
technique to generate these unique index numbers for each value stored in an array format.
This technique is called the hash technique.
Hash table:
The hash table data structure is merely an array of some fixed size, containing the
keys. A key is a string with an associated value.
Each key is mapped into some number in the range 0 to tablesize-1 and placed in the
appropriate cell. In the following example, tablesize is 5 ie., 0 to 4.
21 % 5 = 1, so 21 is placed in cell 1.
32 % 5 = 2, so 32 is placed in cell 2.
18 % 5 = 3, so 18 is placed in cell 3.
Hash function:
A hash function is a key to address transformation which acts upon a given key to
compute the relative position of the key in an array.
The choice of hash function should be simple and it must distribute the data evenly.
Types of hash functions:
1. Division method: The hash function depends upon the remainder of division.
2. Mid square: In the mid square method, the key is squared and the middle or mid part of
the result is used as the index.
Consider that if we want to place a record 3111 then for the hash table size 1000
31112 = 9678321
H(3111) = 783 ( the middle 3 digits)
3. Digital Folding: The Key is divided into separate part and using some simple operation
these parts are combined to produce the hash key.
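The three methods above can be sketched in Python; the digit counts, part sizes and helper names below are illustrative assumptions, not part of the original notes:

```python
def h_division(key, table_size):
    # Division method: the index is the remainder of the key by the table size.
    return key % table_size

def h_mid_square(key, digits=3):
    # Mid-square method: square the key and take the middle digits of the result.
    s = str(key * key)
    mid = len(s) // 2
    half = digits // 2
    return int(s[mid - half : mid - half + digits])

def h_folding(key, part_size=2):
    # Digit folding: split the key into fixed-size parts and add the parts together.
    s = str(key)
    parts = [int(s[i:i + part_size]) for i in range(0, len(s), part_size)]
    return sum(parts)

print(h_division(21, 5))    # 1, since 21 % 5 = 1
print(h_mid_square(3111))   # 783, since 3111^2 = 9678321 and the middle 3 digits are 783
print(h_folding(123456))    # 102, since 12 + 34 + 56 = 102
```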
Collision:
Separate Chaining:
Separate chaining is a collision resolution technique to keep the list of all elements that
hash to the same value. This is called separate chaining because each hash table element
is a separate chain (linked list). Each linked list contains all the elements whose keys hash
to the same index.
More number of elements can be inserted as it uses linked lists. For ex, insert
18,54,28,25,41,38,36,12,90.
In the worst case, operations on an individual bucket take time proportional to the size
of the bucket. Assuming we use a good hash function to index the n items of our map in a
bucket array of capacity N, the expected size of a bucket is n/N.
Therefore, if given a good hash function, the core map operations run in O( n/N). The
ratio λ = n/N, called the load factor of the hash table, should be bounded by a small constant,
preferably below 1. As long as λ is O(1), the core operations on the hash table run in O(1)
expected time.
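The chaining idea can be sketched in a few lines; this is a minimal sketch in which each bucket is a Python list standing in for a linked-list chain, and the class name and table size are assumptions for illustration:

```python
class ChainedHashTable:
    # Each bucket is a Python list standing in for a linked-list chain.
    def __init__(self, size=10):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def insert(self, key):
        # Append the key to the chain of the bucket it hashes to.
        self.buckets[key % self.size].append(key)

    def search(self, key):
        # Only the one chain for this key needs to be scanned.
        return key in self.buckets[key % self.size]

t = ChainedHashTable(10)
for k in [18, 54, 28, 25, 41, 38, 36, 12, 90]:
    t.insert(k)
print(t.buckets[8])   # [18, 28, 38] - all three keys hash to index 8
print(t.search(41))   # True
```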
Open addressing:
Open addressing requires that the load factor is always at most 1 and that items are
stored directly in the cells of the bucket array itself.
(i) Linear probing - With this approach, if we try to insert an item (k,v) into a bucket A[ j] that
is already occupied, where j = h(k), then we next try A[(j +1) mod N]. If A[(j +1) mod N] is also
occupied, then we try A[(j + 2) mod N], and so on, until we find an empty bucket that can
accept the new item. Once this bucket is located, we simply insert the item there.
Example:
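A minimal sketch of linear-probing insertion; the key sequence 89, 18, 49, 58, 69 and the table size 10 are illustrative assumptions:

```python
def linear_probe_insert(table, key):
    # Try A[h(k)], A[(h(k)+1) mod N], A[(h(k)+2) mod N], ... until an
    # empty cell is found, then place the key there.
    N = len(table)
    j = key % N
    for step in range(N):
        cell = (j + step) % N
        if table[cell] is None:
            table[cell] = key
            return cell
    raise RuntimeError("hash table is full")

A = [None] * 10
for k in [89, 18, 49, 58, 69]:
    linear_probe_insert(A, k)
# 49, 58 and 69 all collide near cell 9 and wrap around to cells 0, 1, 2:
print(A)   # [49, 58, 69, None, None, None, None, None, 18, 89]
```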
(ii) Quadratic probing - Another open addressing strategy, known as quadratic probing,
iteratively tries the buckets A[(h(k) + f(i)) mod N], for i = 0, 1, 2, ..., where f(i) = i², until finding an
empty bucket. As with linear probing, the quadratic probing strategy complicates the removal
operation, but it does avoid the kinds of clustering patterns that occur with linear probing.
Example: insert 55, 22, 90, 37, 49, 17 and 87 into a hash table of size 11.
55 % 11 = 0, 90 % 11 = 2, 37 % 11 = 4, 49 % 11 = 5, 17 % 11 = 6 and 87 % 11 = 10 go straight into empty cells. 22 % 11 = 0 collides with 55, so quadratic probing tries (22 + 1²) % 11 = 1, which is empty.
Index: 0  1  2  3  4  5  6  7  8  9  10
Value: 55 22 90 -  37 49 17 -  -  -  87
(iii) Double hashing - here f(i) = i · hash2(x): we apply a second hash function to x and probe at distances hash2(x), 2·hash2(x), and so on.
In this approach, we choose a secondary hash function, hash2, and if the primary hash function h maps some key k to a bucket A[h(k)] that is already occupied, then we iteratively try the buckets A[(h(k) + f(i)) mod N] next, for i = 1, 2, 3, ..., where f(i) = i · hash2(k). The secondary hash function is not allowed to evaluate to zero; a common choice is hash2(k) = q − (k mod q), for some prime number q smaller than the table size N. Also, N itself should be prime.
Example:
Insert 37, 90, 55, 22, 14 into a hash table of size 7 using the double hashing method, with hash2(X) = 5 − (X mod 5) (R = 5, a prime smaller than the table size).
37 % 7 = 2, 90 % 7 = 6, 22 % 7 = 1 and 14 % 7 = 0 go straight into empty cells. 55 % 7 = 6 collides with 90; hash2(55) = 5 − 0 = 5, so we probe (6 + 5) % 7 = 4, which is empty.
Index: 0  1  2  3  4  5  6
Value: 14 22 37 -  55 -  90
Linear probing has the best cache performance, but suffers from clustering. One
more advantage of Linear probing is easy to compute.
Quadratic probing lies between the two in terms of cache performance and
clustering.
Double hashing has poor cache performance but no clustering. Double hashing
requires more computation time as two hash functions need to be computed.
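The three open-addressing strategies differ only in the probe function f(i). This small sketch prints the first few probe cells for one sample key; the key 22, table size 11 and secondary-hash prime R = 7 are illustrative assumptions:

```python
def probe_sequence(key, N, f):
    # Cells tried for a key: A[(h(k) + f(i)) mod N] for i = 0, 1, 2, ...
    h = key % N
    return [(h + f(i)) % N for i in range(5)]

R = 7            # a prime smaller than the table size, for the secondary hash
key, N = 22, 11  # sample key and table size

linear    = probe_sequence(key, N, lambda i: i)                  # f(i) = i
quadratic = probe_sequence(key, N, lambda i: i * i)              # f(i) = i^2
double    = probe_sequence(key, N, lambda i: i * (R - key % R))  # f(i) = i*hash2(k)

print(linear)     # [0, 1, 2, 3, 4]  - adjacent cells, hence clustering
print(quadratic)  # [0, 1, 4, 9, 5]  - quadratically spreading cells
print(double)     # [0, 6, 1, 7, 2]  - steps of hash2(22) = 7 - 22 % 7 = 6
```

Linear probing visits adjacent cells (good cache behaviour, but clusters); double hashing jumps by a key-dependent stride, which breaks clusters at the cost of a second hash computation.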
2. What is searching?
Searching is the process of locating a given element in a collection of data and reporting its position, or reporting failure when the element is not present.
9. Define Hashing.
Hashing is the process of mapping a large amount of data to a smaller table with
the help of a hashing function. The modulo operator is commonly used to compute the table
index from the actual data/information.
The hash table data structure is merely an array of some fixed size, containing the
keys. A key is a string with an associated value.
Each key is mapped into some number in the range 0 to tablesize-1 and placed in the
appropriate cell. In the following example, tablesize is 5 ie., 0 to 4.
21 % 5 = 1, so 21 is placed in cell 1.
18 % 5 = 3, so 18 is placed in cell 3.
32 % 5 = 2, so 32 is placed in cell 2.
A hash function is a key to address transformation which acts upon a given key to
compute the relative position of the key in an array.
The choice of hash function should be simple and it must distribute the data evenly.
13. What do you mean by collision in hashing? Name some collision resolution
techniques.
A collision occurs when two different keys hash to the same index of the hash table. Collision resolution techniques include separate chaining and open addressing (linear probing, quadratic probing and double hashing).
Separate chaining is a collision resolution technique to keep the list of all elements
that hash to the same value. This is called separate chaining because each hash table
element is a separate chain (linked list). Each linked list contains all the elements whose keys
hash to the same index.
More number of elements can be inserted as it uses linked lists. For ex, insert
12,17,22,24.
12 % 5 = 2, 17 % 5 = 2, 22 % 5 = 2 and 24 % 5 = 4.
Bucket 2 holds the chain 12 → 17 → 22, bucket 4 holds 24, and buckets 0, 1 and 3 are empty.
In the linear probing collision resolution strategy, blocks of occupied cells start forming
even if the table is relatively empty. This effect is known as primary clustering: any key that
hashes into the cluster requires several attempts to resolve the collision, and the key then
adds to the cluster.
19. What are the types of collision resolution strategies in open addressing?
Quadratic probing - If a collision occurs, alternative cells are tried until an empty cell
is found. As in the linear probing method, the hash table is represented as a one-dimensional
array with indices that range from 0 to tablesize − 1.
In quadratic probing, the alternative cells are calculated using the formula F(i) = i².
Double hashing - in which F(i) = i · hash2(X). This formula says that we apply a second
hash function to X and probe at distances hash2(X), 2·hash2(X), and so on. A function
such as hash2(X) = R − (X mod R), with R a prime smaller than the table size, works well.
Linear probing has the best cache performance, but suffers from clustering. One more
advantage of Linear probing is easy to compute.
Quadratic probing lies between the two in terms of cache performance and clustering.
Double hashing has poor cache performance but no clustering. Double hashing
requires more computation time as two hash functions need to be computed.
Although quadratic probing eliminates primary clustering, elements that hash to the
same position will probe the same alternative cells. This is known as secondary clustering.
Rehashing builds another table that is about twice as big (with an associated new hash
function), scans down the entire original hash table, computes the new hash value for each
element, and inserts it into the new table.
If either open addressing hashing or separate chaining hashing is used, the major
problem is that collisions could cause several blocks to be examined during a Find, even for
a well-distributed hash table. Extendible hashing allows a find to be performed in two disk
accesses. Insertions also require few disk accesses.
A linear search scans one item at a time, without jumping over any item.
Its worst-case complexity is O(n), so it is sometimes known as an O(n) search.
The time taken to search keeps increasing as the number of elements increases.
For a large amount of stored data, linear search and binary search perform
lookups with time complexity O(n) and O(log n) respectively.
As the size of the dataset increases, even these costs become significant, which is
not always acceptable.
We need a technique that does not depend on the size of the data. Hashing allows
lookups to occur in constant time, i.e., O(1).
Tree ADT:
Tree is an abstract data type that stores elements hierarchically. With the exception of
the top element, each element in a tree has a parent element and zero or more children
elements.
A tree is usually visualized by placing elements inside ovals or rectangles, and by
drawing the connections between parents and children with straight lines.
Formal Tree Definition
Formally, we define a tree T as a set of nodes storing elements such that the nodes
have a parent-child relationship that satisfies the following properties:
If T is nonempty, it has a special node, called the root of T, which has no parent.
Each node v of T different from the root has a unique parent node w; every node with
parent w is a child of w.
Node Relationships
A node v is external, if v has no children. External nodes are also known as leaves.
A node v is internal if it has one or more children.
A node u is an ancestor of a node v, if u = v or u is an ancestor of the parent of v.
Conversely, we say that a node v is a descendant of a node u if u is an ancestor of v.
A tree is ordered if there is a meaningful linear order among the children of each node.
Path: Path refers to the sequence of nodes along the edges of a tree.
Root: The node at the top of the tree is called root. There is only one root per tree
and one path from the root node to any node.
Parent: Any node except the root node has one edge upward to a node called parent.
Child: The node below a given node connected by its edge downward is called its
child node.
Sub tree: Sub tree represents the descendants of a node.
Traversing: Traversing means passing through nodes in a specific order.
Levels: The level of a node represents the generation of the node. If the root node is at level
0, then its child node is at level 1, its grandchild is at level 2, and so on.
Keys: Key represents a value of a node based on which a search operation is to be
carried out for a node.
Siblings: All the nodes that share the same parent are called siblings.
Depth: The depth of a node N is the length of the path from the root to the node N
Height: The Height of a node N is the length of the path from the node to the deepest
leaf.
Properties of Tree:
Every tree has a special node called the root node. The root node can be used to
traverse every node of the tree. It is called root because the tree originated from root
only.
If a tree has N vertices (nodes), then the number of edges is always one less than the
number of nodes (vertices), i.e., N − 1. If it has more than N − 1 edges, it is a graph, not
a tree.
Every child has only a single parent, but a parent can have multiple children.
Example
Height
The height of a position p in a tree T is also defined recursively:
If p is a leaf, then the height of p is 0.
Otherwise, the height of p is one more than the maximum of the heights of p’s
children. The height of a nonempty tree T is the height of the root of T.
def height(self, p):
    if self.is_leaf(p):
        return 0
    else:
        return 1 + max(self.height(c) for c in self.children(p))
Types of Tree:
1. Binary Tree
A binary tree is a type of tree in which each parent can have at most two children. The
children are referred to as the left child and the right child.
2. Binary Search Tree
A binary search tree is a binary tree in which, for every node, the keys in the left subtree are
smaller than the node’s key and the keys in the right subtree are larger.
3. AVL Tree:
An AVL tree is a self-balancing binary search tree. In an AVL tree, the heights of the children of
a node differ by at most 1; the valid balance factors are 1, 0 and -1. When a new
node is added to the AVL tree and the tree becomes unbalanced, a rotation is done to make
sure that the tree remains balanced.
4. B-tree
A B-tree is another self-balancing search tree that comprises many nodes to keep data
stored in a particular order. Each node can have more than two child nodes, and each node
comprises multiple keys. B-trees suit file systems and databases that read and write
large blocks of data.
5. N-ary Tree:
In an N-ary tree, the maximum number of children that a node can have is limited to
N. A binary tree is a 2-ary tree, as each node in a binary tree has at most 2 children. The trie
data structure is one of the most commonly used implementations of an N-ary tree. A full
N-ary tree is a tree in which every node has either 0 or N children. A complete N-ary tree is a
tree in which all the leaf nodes are at the same level.
Advantages of Tree:
A tree reflects structural relationships in the data.
Trees are used to represent hierarchies.
They offer efficient search and insertion procedures.
Trees are flexible, which allows subtrees to be relocated with minimal effort.
Tree Traversal:
Traversal is the process of visiting all the nodes of a tree (and possibly printing their values).
Because all nodes are connected via edges (links), we always start from the root (head) node;
that is, we cannot randomly access a node in a tree. There are three ways in which we
traverse a tree −
• In-order Traversal
• Pre-order Traversal
• Post-order Traversal
Generally, we traverse a tree to search or locate a given item or key in the tree or to
print all the values it contains.
In-order Traversal
In this traversal method, the left sub tree is visited first, then the root and later the right
sub-tree. We should always remember that every node may represent a sub tree itself.
If a binary tree is traversed in-order, the output will produce sorted key values in an
ascending order.
We start from A and, following in-order traversal, move to its left subtree B. B is
also traversed in-order. The process goes on until all the nodes are visited. The output of
in-order traversal of this tree will be −
D→B→E→A→F→C→G
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
Python Code for inorder traversal:
def Inorder(self):
    if self.left:
        self.left.Inorder()
    print(self.data)
    if self.right:
        self.right.Inorder()
Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and finally
the right subtree.
We start from A and, following pre-order traversal, first visit A itself and then move
to its left subtree B. B is also traversed pre-order. The process goes on until all the nodes are
visited. The output of pre-order traversal of this tree will be −
A → B → D → E → C → F →G
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.
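The steps above can be sketched as a method on a simple node class; the Node class and the sample tree with nodes A–G (the same tree as the example) are assumptions for illustration:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

    def Preorder(self, out=None):
        # Visit the root first, then the left subtree, then the right subtree.
        out = [] if out is None else out
        out.append(self.data)
        if self.left:
            self.left.Preorder(out)
        if self.right:
            self.right.Preorder(out)
        return out

# Build the example tree: root A, children B and C, leaves D, E, F, G.
root = Node('A')
root.left, root.right = Node('B'), Node('C')
root.left.left, root.left.right = Node('D'), Node('E')
root.right.left, root.right.right = Node('F'), Node('G')
print(root.Preorder())   # ['A', 'B', 'D', 'E', 'C', 'F', 'G']
```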
Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we traverse
the left subtree, then the right subtree and finally the root node.
We start from A, and following Post-order traversal, we first visit the left subtree B. B
is also traversed post-order. The process goes on until all the nodes are visited. The output of
post-order traversal of this tree will be −
D→E→B→F→G→C→A
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.
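The same sketch works for post-order: the visit moves after both recursive calls. The Node class and the A–G sample tree are again assumptions for illustration:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

    def Postorder(self, out=None):
        # Visit the left subtree, then the right subtree, then the root last.
        out = [] if out is None else out
        if self.left:
            self.left.Postorder(out)
        if self.right:
            self.right.Postorder(out)
        out.append(self.data)
        return out

# Build the example tree: root A, children B and C, leaves D, E, F, G.
root = Node('A')
root.left, root.right = Node('B'), Node('C')
root.left.left, root.left.right = Node('D'), Node('E')
root.right.left, root.right.right = Node('F'), Node('G')
print(root.Postorder())   # ['D', 'E', 'B', 'F', 'G', 'C', 'A']
```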
Binary Tree:
A binary tree is an ordered tree with the following properties:
1. Every node has at most two children.
2. Each child node is labeled as being either a left child or a right child.
3. A left child precedes a right child in the order of children of a node.
The subtree rooted at a left or right child of an internal node v is called a left subtree
or right subtree, respectively, of v.
Decision trees:
A decision tree represents the different outcomes that can result from answering a series
of yes-or-no questions. Each internal node is associated with a question; starting at the root,
we go to the left or right child of the current node, depending on whether the answer to the
question is “Yes” or “No.” Such binary trees are known as decision trees.
Arithmetic expression tree:
An arithmetic expression can be represented by a binary tree whose leaves are
associated with variables or constants, and whose internal nodes are associated with one
of the operators +, −, ×, and /. Such a tree is called an arithmetic expression tree.
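Evaluating such a tree is a post-order computation: evaluate both subtrees, then apply the operator at the internal node. A minimal sketch; the ExprNode class is an assumption for illustration, and '*' stands in for × in Python:

```python
class ExprNode:
    # Leaves hold operands; internal nodes hold one of the operators.
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '*': lambda a, b: a * b, '/': lambda a, b: a / b}

def evaluate(node):
    # Post-order: evaluate both subtrees, then apply the operator at the node.
    if node.left is None and node.right is None:
        return node.value
    return OPS[node.value](evaluate(node.left), evaluate(node.right))

# Tree for the expression (2 + 3) * 4:
tree = ExprNode('*', ExprNode('+', ExprNode(2), ExprNode(3)), ExprNode(4))
print(evaluate(tree))   # 20
```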
Example:
Algorithm:
def height(self, root):
    if root is None:
        return 0
    l = self.height(root.left)
    r = self.height(root.right)
    return max(l, r) + 1
Parent of 2 is 1. Parent of 5 is 2.
Algorithm:
def parent(self, data):
    if self.data == data:
        print(data, "is the root")
    elif (self.left is not None and self.left.data == data) or \
         (self.right is not None and self.right.data == data):
        print(self.data)
    elif data < self.data and self.left is not None:
        self.left.parent(data)
    elif data > self.data and self.right is not None:
        self.right.parent(data)
    else:
        print("No such data")
Example:
Algorithm:
def insert(self, data):
    if self.data:
        if data < self.data:
            if self.left is None:
                self.left = Node(data)
            else:
                self.left.insert(data)
        elif data > self.data:
            if self.right is None:
                self.right = Node(data)
            else:
                self.right.insert(data)
    else:
        self.data = data
a) If the tree is empty, report that the value is not present and return.
b) If the tree is not empty, compare the root’s data with the value. If they are equal,
set the flag to true and return.
c) Traverse the left subtree by calling searchNode() recursively and check whether the
value is present in the left subtree.
d) Traverse the right subtree by calling searchNode() recursively and check whether the
value is present in the right subtree.
Algorithm:
def search(self, data):
    if self.data == data:
        print("Data found")
    elif data < self.data and self.left is not None:
        self.left.search(data)
    elif data > self.data and self.right is not None:
        self.right.search(data)
    else:
        print("Not Found")
A binary tree that stores its elements in sorted order is called a binary search tree. A binary
search tree for a set S is a binary tree T such that, for each position p of T:
p stores an element of S, denoted e(p);
the elements stored in the left subtree of p are less than e(p);
the elements stored in the right subtree of p are greater than e(p).
A binary search tree hierarchically represents the sorted order of its keys: an inorder
traversal of a binary search tree visits positions in increasing order of their keys.
def search(self, data):
    if self.data == data:
        print("Data found")
    elif data < self.data and self.left is not None:
        self.left.search(data)
    elif data > self.data and self.right is not None:
        self.right.search(data)
    else:
        print("Not Found")
def insert(self, data):
    if self.data:
        if data < self.data:
            if self.left is None:
                self.left = BinarySearchTree(data)
            else:
                self.left.insert(data)
        elif data > self.data:
            if self.right is None:
                self.right = BinarySearchTree(data)
            else:
                self.right.insert(data)
    else:
        self.data = data
To delete an item with key k, we begin by calling TreeSearch(T, T.root( ), k) to find the
position p of T storing an item with key equal to k. If the search is successful, we distinguish
between three cases
Input the data of the node to be deleted.
Case 1: If the node is a leaf node, delete the node directly.
Case 2: Else if the node has one child, copy the child to the node to be deleted and
delete the child node.
Case 3: Else if the node has two children, find the inorder successor of the node.
Copy the contents of the inorder successor to the node to be deleted and delete
the inorder successor.
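The three cases can be sketched as one recursive function; the Node class, the insert helper and the sample keys below are assumptions for illustration:

```python
class Node:
    def __init__(self, data):
        self.data, self.left, self.right = data, None, None

def insert(root, data):
    # Standard BST insertion, used only to build the sample tree.
    if root is None:
        return Node(data)
    if data < root.data:
        root.left = insert(root.left, data)
    elif data > root.data:
        root.right = insert(root.right, data)
    return root

def inorder(root):
    return [] if root is None else inorder(root.left) + [root.data] + inorder(root.right)

def delete(root, key):
    # Delete key from the BST and return the new subtree root.
    if root is None:
        return None
    if key < root.data:
        root.left = delete(root.left, key)
    elif key > root.data:
        root.right = delete(root.right, key)
    else:
        # Cases 1 and 2: no child, or a single child.
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Case 3: two children - copy the inorder successor (the smallest
        # key in the right subtree) and delete that successor instead.
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.data = succ.data
        root.right = delete(root.right, succ.data)
    return root

root = None
for k in [8, 3, 10, 1, 6, 14, 4, 7, 13]:
    root = insert(root, k)
root = delete(root, 3)   # deletes a node with two children
print(inorder(root))     # [1, 4, 6, 7, 8, 10, 13, 14]
```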
class BinarySearchTree:
    def __init__(self, data):
        self.left = None
        self.data = data
        self.right = None
# 4,3,5,1,2
    def insert(self, data):
        if self.data:
            if data < self.data:
                if self.left is None:
                    self.left = BinarySearchTree(data)
                else:
                    self.left.insert(data)
            elif data > self.data:
                if self.right is None:
                    self.right = BinarySearchTree(data)
                else:
                    self.right.insert(data)
        else:
            self.data = data
    def findMin(self):
        if self.data:
            if self.left is not None:
                self.left.findMin()
            else:
                print(self.data)
        else:
            print("Tree Not Found")

    def findMax(self):
        if self.data:
            if self.right is not None:
                self.right.findMax()
            else:
                print(self.data)
        else:
            print("Tree Not Found")
    def parent(self, data):
        if self.data == data:
            print(data, "is the root")
        elif (self.left is not None and self.left.data == data) or \
             (self.right is not None and self.right.data == data):
            print(self.data)
        elif data < self.data and self.left is not None:
            self.left.parent(data)
        elif data > self.data and self.right is not None:
            self.right.parent(data)
        else:
            print("No such data")
    def inorder(self):
        if self.data is not None:
            if self.left is not None:
                self.left.inorder()
            print(self.data)
            if self.right is not None:
                self.right.inorder()
AVL Tree
A tree is called an AVL tree if each node of the tree possesses one of the following properties:
A node is called left heavy if the longest path in its left subtree is one longer than the
longest path of its right subtree
A node is called right heavy if the longest path in the right subtree is one longer than
the path in its left subtree
A node is called balanced if the longest path in both the right and left subtree are equal.
AVL tree is a height-balanced tree where the difference between the heights of the
right subtree and left subtree of every node is either -1, 0 or 1. The difference between the
heights of the subtrees is maintained by a factor named the balance factor. Therefore, we can
define an AVL tree as a balanced binary search tree where the balance factor of every node in
the tree is either -1, 0, or +1. Here, the balance factor is calculated by the formula:
Balance Factor = height(left subtree) − height(right subtree)
As the AVL tree is height-balanced, it helps to control the height of the binary search
tree and prevents skewing. When a binary search tree becomes skewed, the
running time degrades to the worst case, i.e., O(n), but in the case of the AVL
tree the time complexity remains O(log n). Therefore, it is usually advisable to use an AVL tree
rather than a plain binary search tree.
Every AVL Tree is a binary search tree but every Binary Search Tree need not be AVL
Tree.
AVL Rotation
When certain operations like insertion and deletion are performed on the AVL tree, the
balance factor of the tree may get affected. If after the insertion or deletion of the element, the
balance factor of any node is affected then this problem is overcome by using rotation.
Therefore, rotation is used to restore the balance of the search tree. Rotation is the method of
moving the nodes of the tree either to the left or to the right to make the tree height-balanced.
There are two categories of rotation, each further divided into two parts:
1) Single Rotation
Single rotation switches the roles of the parent and child while maintaining the search
order. We rotate the node and its child; the child becomes the parent.
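A single rotation can be sketched in a few lines; the TreeNode class and the attribute names are assumptions for illustration. rRotate fixes a left-left imbalance and lRotate a right-right one:

```python
class TreeNode:
    def __init__(self, value):
        self.value, self.left, self.right = value, None, None

def rRotate(z):
    # Single right rotation: the left child y becomes the new subtree root.
    y = z.left
    z.left = y.right
    y.right = z
    return y

def lRotate(z):
    # Single left rotation: the right child y becomes the new subtree root.
    y = z.right
    z.right = y.left
    y.left = z
    return y

# A left-left heavy chain 30 -> 20 -> 10 is fixed by one right rotation.
z = TreeNode(30)
z.left = TreeNode(20)
z.left.left = TreeNode(10)
root = rRotate(z)
print(root.value, root.left.value, root.right.value)   # 20 10 30
```

Note that the search order is preserved: 10 < 20 < 30 both before and after the rotation.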
2) Double Rotation
A single rotation does not fix the LR and RL imbalance cases. These require a double
rotation involving three nodes; a double rotation is equivalent to a sequence of two
single rotations.
LR(Left-Right) Rotation
The LR rotation is the process where we perform a single left rotation followed by a
single right rotation. Therefore, first, every node moves towards the left and then the node of
this new tree moves one position towards the right. Let us see the below example
RL (Right-Left) Rotation
117
Downloaded by s h s g (pgpcet2022@gmail.com)
lOMoARcPSD|42733091
The RL rotation is the process where we perform a single right rotation followed by a
single left rotation. Therefore, first, every node moves towards the right and then the node of
this new tree moves one position towards the left. Let us see the below example
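The two double rotations are simply compositions of the single rotations, which can be sketched as follows (the class and function names are assumptions for illustration):

```python
class TreeNode:
    def __init__(self, value):
        self.value, self.left, self.right = value, None, None

def rRotate(z):
    # Single right rotation.
    y = z.left
    z.left = y.right
    y.right = z
    return y

def lRotate(z):
    # Single left rotation.
    y = z.right
    z.right = y.left
    y.left = z
    return y

def lrRotate(z):
    # LR: left-rotate the left child, then right-rotate the node itself.
    z.left = lRotate(z.left)
    return rRotate(z)

def rlRotate(z):
    # RL: right-rotate the right child, then left-rotate the node itself.
    z.right = rRotate(z.right)
    return lRotate(z)

# A left-right chain 30 -> 10 -> 20 needs an LR (double) rotation.
z = TreeNode(30)
z.left = TreeNode(10)
z.left.right = TreeNode(20)
root = lrRotate(z)
print(root.value, root.left.value, root.right.value)   # 20 10 30
```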
1. Insertion Operation
2. Deletion Operation
1. Find the appropriate empty subtree where the new value should be added by
comparing the values in the tree
2. Create a new node at the empty subtree
3. The new node is a leaf and thus will have a balance factor of zero
4. Return to the parent node and adjust the balance factor of each node through the
rotation process and continue it until we are back at the root. Remember that the
modification of the balance factor must happen in a bottom-up fashion
Example:
The root node is added as shown in the below figure
A child node is then added to the root node, as shown below. Here the tree is still balanced
Then another right child is added to the tree. Now the balance factor of the tree is
violated, so a single rotation is performed and the tree becomes a balanced tree
Later, one more right child is added to the new tree as shown below
Further, one more right child is added and the balance factor of the tree is again violated.
Therefore, another single rotation is performed on the tree and the balance factor is
restored, as shown in the figure below
        # (continuing AVLTree.insert) rebalance the subtree rooted at root:
        if b > 1 and key < root.left.value:        # Left-Left case
            return self.rRotate(root)
        if b < -1 and key > root.right.value:      # Right-Right case
            return self.lRotate(root)
        if b > 1 and key > root.left.value:        # Left-Right case
            root.left = self.lRotate(root.left)
            return self.rRotate(root)
        if b < -1 and key < root.right.value:      # Right-Left case
            root.right = self.rRotate(root.right)
            return self.lRotate(root)
        return root
Deletion Operation In AVL
The deletion operation in the AVL tree is the same as the deletion operation in BST. In
the AVL tree, the node is always deleted as a leaf node and after the deletion of the node, the
balance factor of each node is modified accordingly. Rotation operations are used to modify
the balance factor of each node. The algorithm steps of deletion operation in an AVL tree are:
Example:
Let us consider the below AVL tree with the given balance factor as shown in the
figure below
Here, we have to delete the node '25' from the tree. As the node to be deleted does
not have any child node, we will simply remove the node from the tree
After removal of the node, the balance factor of the tree changes and, therefore, a
rotation is performed to restore the balance factor and produce a perfectly balanced
tree.
        # Inside class AVLTree (with a treeNode class as before): after deleting
        # the node as in a plain BST, the subtree is rebalanced:
        b = self.getBal(root)
        if b > 1 and key < root.left.value:
            return self.rRotate(root)
        if b < -1 and key > root.right.value:
            return self.lRotate(root)
        if b > 1 and key > root.left.value:
            root.left = self.lRotate(root.left)
            return self.rRotate(root)
        if b < -1 and key < root.right.value:
            root.right = self.rRotate(root.right)
            return self.lRotate(root)
        return root
Tree = AVLTree()
root = None
root = Tree.insert(root, 1)
root = Tree.insert(root, 2)
root = Tree.insert(root, 3)
root = Tree.insert(root, 4)
root = Tree.insert(root, 5)
root = Tree.insert(root, 6)
Heap:
A heap is a data structure that follows the complete binary tree property and satisfies the
heap property; therefore, it is also known as a binary heap. A complete binary tree is a tree
in which every level, except possibly the last, is completely filled, and all nodes in the last
level are as far to the left as possible.
In the heap data structure, we assign a key value or weight to every node of the tree.
The key of each node is compared with its children's keys, and the tree is
arranged accordingly into one of two categories, i.e., max-heap and min-heap.
Heapify:
The process of creating a heap data structure using the binary tree is called Heapify.
The heapify process is used to create the Max-Heap or the Min-Heap.
Min Heap
When the value of each internal node is smaller than the values of its children, the tree
satisfies the min-heap property. Also, in the min-heap, the value of the root node is the
smallest among all the other nodes of the tree. Therefore, if “a” has a child node “b”, then
key(a) ≤ key(b).
#Python Code
def min_heapify(A, k):
    l = left(k)
    r = right(k)
    # Find the smallest among A[k] and its children.
    if l < len(A) and A[l] < A[k]:
        smallest = l
    else:
        smallest = k
    if r < len(A) and A[r] < A[smallest]:
        smallest = r
    if smallest != k:
        A[k], A[smallest] = A[smallest], A[k]
        min_heapify(A, smallest)

def left(k):
    return 2 * k + 1

def right(k):
    return 2 * k + 2

def build_min_heap(A):
    n = (len(A) // 2) - 1
    for k in range(n, -1, -1):
        min_heapify(A, k)

A = [3, 9, 2, 1, 4, 5]
build_min_heap(A)
print(A)
Max Heap
When the value of each internal node is greater than the values of its children, the tree
satisfies the max-heap property. Also, in the max-heap, the value of the root node is the
greatest among all the other nodes of the tree. Therefore, if “a” has a child node “b”, then
key(a) ≥ key(b).
#Python Code
def max_heapify(A, k):
    l = left(k)
    r = right(k)
    # find the largest among A[k] and its children
    if l < len(A) and A[l] > A[k]:
        largest = l
    else:
        largest = k
    if r < len(A) and A[r] > A[largest]:
        largest = r
    if largest != k:
        A[k], A[largest] = A[largest], A[k]
        max_heapify(A, largest)

def left(k):
    return 2 * k + 1

def right(k):
    return 2 * k + 2

def build_max_heap(A):
    n = (len(A) // 2) - 1   # index of the last non-leaf node
    for k in range(n, -1, -1):
        max_heapify(A, k)

A = [3, 9, 2, 1, 4, 5]
build_max_heap(A)
print(A)   # [9, 4, 5, 1, 3, 2]
Time complexity
Each call to heapify costs O(log(n)), and building the heap makes O(n) such calls, so a
straightforward bound on the running time of building a heap is O(n log(n)). (A tighter
analysis shows that building a heap in fact takes only O(n) time.)
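In practice, Python’s standard-library heapq module provides a ready-made binary min-heap on top of a plain list; a minimal sketch:

```python
import heapq

# heapq maintains the min-heap invariant A[k] <= A[2k+1] and A[k] <= A[2k+2]
A = [3, 9, 2, 1, 4, 5]
heapq.heapify(A)          # build a min-heap in place, in O(n) time
print(A[0])               # the root is the smallest element -> 1
print(heapq.heappop(A))   # pop the minimum in O(log n) -> 1
heapq.heappush(A, 0)      # push preserves the heap property in O(log n)
print(A[0])               # -> 0
```

Note that heapq only provides a min-heap; a common workaround for max-heap behaviour is to negate the keys.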
Applications of Heap
(2,4)-Tree Operations:
A multiway search tree that keeps the secondary data structures stored at each node
small and also keeps the primary multiway tree balanced is the (2,4) tree, which is
sometimes called a 2-4 tree or 2-3-4 tree. This data structure achieves these goals by
maintaining two simple properties:
Size Property: Every internal node has at most four children.
Depth Property: All the external nodes have the same depth.
2. What is a sibling?
Two nodes that are children of the same parent are siblings.
16. Write the in-order, pre-order, post-order and Breadth-First or Level Order Traversal
for the given tree.
Inorder : 3 7 8 6 11 2 5 4 9
Preorder : 2 7 3 6 8 11 5 9 4
Postorder : 3 8 11 6 7 4 9 5 2
18. What is the Degree of a node in a tree? What are the Degrees of A and B for the given tree?
20. What is the Height of a node in a tree? What is the height of E in the given tree?
Binary Tree
Let p be a pointer to a node and x be the information. Now, the basic operations are:
i) info(p)
ii) father(p) / parent(p)
iii) left(p)
iv) right(p)
v) brother(p)
vi) isleft(p)
vii) isright(p)
33. How can you say a recursive procedure is more efficient than a non-recursive one?
There is no extra recursion. The automatic stacking and unstacking make it more
efficient. There are no extraneous parameters and local variables used.
39. In a binary max heap containing n numbers, the smallest element can be found in
______ time.
Time complexity: O(n). In a max heap, the smallest element is always present at a
leaf node, so we need to check all leaf nodes for the minimum value. The worst-case
complexity is O(n).
Max Heapify
Graphs:
A graph is a way of representing relationships that exist between pairs of objects.
That is, a graph is a set of objects, called vertices, together with a collection of pairwise
connections between them, called edges. It can also be represented as G=(V, E).
Vertex − Each node of the graph is represented as a vertex. In the following example, the
labelled circles represent vertices. Thus, A to E are vertices.
Edge − An edge represents a path (a line) between two vertices. In the following example,
the lines from A to B, B to E, and so on represent edges.
Adjacency − Two nodes (vertices) are adjacent if they are connected to each other through
an edge. In the following example, B is adjacent to A, D is adjacent to B, and so on.
Path − A path represents a sequence of edges between two vertices.
Directed path - is a path such that all edges are directed and are traversed along their
direction.
Length - The number of edges in a path is called the length of the path. For example, the
length of the path from A to D in the above graph is 2, because it contains the two edges
(A,B) and (B,D).
subgraph - A subgraph of a graph G is a graph H whose vertices and edges are subsets of
the vertices and edges of G, respectively.
Tree - A tree is a connected forest, that is, a connected graph without cycles.
Types of Graphs
1. Directed Graph
A directed graph is a graph in which the edges are directed; each edge is unidirectional. In
a directed graph, the edge (A,C) is not the same as (C,A). It is also called a digraph.
2. Undirected Graph
An undirected graph is a graph in which the edges are undirected; each edge is bidirectional.
In an undirected graph, (A,C) = (C,A).
3. Weighted Graph
A weighted graph is a graph in which each edge is assigned a weight or value. This value is
considered as the cost/distance of traversing from one vertex to another. A weighted graph
can be either directed or undirected.
4. Complete Graph
A complete graph is a graph in which there is an edge between each pair of vertices, so
there is a path from each vertex to every other vertex. A complete graph with n vertices
has n(n-1)/2 edges.
5. Cyclic Graph
A cyclic graph is a graph that has cycles. A cycle is a path that starts and ends at the
same vertex.
6. Acyclic Graph
An acyclic graph is a graph that does not have any cycles in it. A directed graph without
cycles is called a Directed Acyclic Graph (DAG).
1. Edge List:
It maintains an unordered list of all edges, but there is no efficient way to locate a
particular edge (u,v), or the set of all edges incident to a vertex v.
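As a rough sketch (the vertex names are illustrative), an edge list can be kept as a plain Python list of pairs; note that finding the edges incident to a vertex requires scanning the entire list:

```python
# Edge list: an unordered list of (u, v) pairs
edges = [('A', 'B'), ('B', 'E'), ('A', 'D'), ('B', 'D')]

def incident_edges(edges, v):
    # O(m) scan: a plain edge list offers no faster lookup
    return [e for e in edges if v in e]

print(incident_edges(edges, 'B'))  # [('A', 'B'), ('B', 'E'), ('B', 'D')]
```

This linear-time scan is exactly the inefficiency that the adjacency list, map, and matrix structures below are designed to avoid.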
Graph Traversals:
A traversal is a systematic procedure for exploring a graph by examining all of its
vertices and edges. A traversal is efficient if it visits all the vertices and edges in time
proportional to their number, that is, in linear time.
Graph traversal captures the notion of reachability. Reachability problems in an undirected
graph G include the following:
Computing a path from vertex u to vertex v, or reporting that no such path
exists.
Given a start vertex s of G, computing, for every vertex v of G, a path with the
minimum number of edges between s and v, or reporting that no such path
exists.
Testing whether G is connected.
Computing a spanning tree of G, if G is connected.
Computing a cycle in G, or reporting that G has no cycles.
Example:
Breadth-First Search
Breadth-first search (BFS) is a strategy for traversing a connected component of a
graph.
Procedure:
A BFS proceeds in rounds and subdivides the vertices into levels.
BFS starts at vertex s, which is at level 0.
In the first round, we mark as “visited” all vertices adjacent to the start vertex s;
these vertices are one step away from the beginning and are placed into level 1.
In the second round, we go two steps (i.e., edges) away from the starting vertex.
These new vertices, which are adjacent to level 1 vertices and not previously assigned
to a level, are placed into level 2 and marked as “visited.”
This process continues in similar fashion, terminating when no new vertices are
found in a level.
Python Code:
# sample graph as an adjacency list (the document's graph figure is not reproduced here)
graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': []
}
visited = []   # vertices already discovered
queue = []     # FIFO queue of vertices waiting to be explored

def bfs(visited, graph, node):
    visited.append(node)
    queue.append(node)
    while queue:
        s = queue.pop(0)
        print(s, end=" ")
        for neighbour in graph[s]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

bfs(visited, graph, 'A')   # prints: A B C D E F
Example:
   A  B  C  D  E  F  G  H
A  0  0  0  0  0  0  0  0
B  0  0  0  0  0  0  0  0
C  1  0  0  0  0  0  0  0
D  3  2  1  1  0  0  0  0
E  1  1  0  0  0  0  0  0
F  2  2  2  2  1  0  0  0
G  2  2  2  1  1  1  0  0
H  3  3  2  2  2  2  1  0
Resulting traversal order: A C E B D F G H
Greedy Algorithm
An algorithm is designed to achieve the optimum solution for a given problem. Greedy
algorithms try to find a locally optimal solution (i.e., the problem is solved by selecting
the best option available at the moment), which may eventually lead to a globally optimized
solution. A greedy algorithm does not worry about whether the current best result will bring
the overall optimal result.
The algorithm never reverses an earlier decision, even if the choice is wrong. It
works in a top-down approach.
This algorithm may not produce the best result (a globally optimized solution) for
all problems, because it always goes for the locally best choice.
However, we can determine if the algorithm can be used with any problem if the
problem has the following properties:
1. Greedy Choice Property
If an optimal solution to the problem can be found by choosing the best choice at
each step without reconsidering the previous steps once chosen, the problem can be solved
using a greedy approach. This property is called greedy choice property.
2. Optimal Substructure
If the optimal overall solution to the problem corresponds to the optimal solution to its
subproblems, then the problem can be solved using a greedy approach. This property is
called optimal substructure.
Advantages of Greedy Approach
The algorithm is easier to describe.
This algorithm can perform better than other algorithms (but, not in all cases).
Drawback of Greedy Approach
As mentioned earlier, the greedy algorithm doesn't always produce the optimal
solution. This is the major disadvantage of the algorithm.
For example, suppose we want to find the longest path in the graph below from root
to leaf. Let's use the greedy algorithm here.
Greedy Approach
1. Let's start with the root node 20. The weight of the right child is 3 and the weight of the left child is 2.
2. Our problem is to find the longest (largest-weight) path. The optimal choice at the moment is 3, so the
greedy algorithm will choose 3.
3. Finally, the weight of the only child of 3 is 1. This gives us our final result 20 + 3 + 1 = 24.
However, it is not the optimal solution. There is another path that carries more weight (20 + 2 + 10 = 32)
as shown in the image below.
Longest path
Therefore, greedy algorithms do not always give an optimal/feasible solution.
Greedy Algorithm
1. To begin with, the solution set (containing answers) is empty.
2. At each step, an item is added to the solution set until a solution is reached.
3. If the solution set is feasible, the current item is kept.
4. Else, the item is rejected and never considered again.
Counting Coins
This problem is to count to a desired value by choosing the least possible coins and
the greedy approach forces the algorithm to pick the largest possible coin. If we are
provided coins of ₹ 1, 2, 5 and 10 and we are asked to count ₹ 18 then the greedy
procedure will be −
1 − Select one ₹ 10 coin, the remaining count is 8
2 − Then select one ₹ 5 coin, the remaining count is 3
3 − Then select one ₹ 2 coin, the remaining count is 1
4 − Finally, selecting one ₹ 1 coin solves the problem.
Though it seems to work fine for this count (we need to pick only 4 coins), if we
slightly change the problem, the same approach may not be able to produce the same
optimum result.
For a currency system with coins of value 1, 7 and 10, counting coins for the value
18 is still absolutely optimum, but for a count like 15 the greedy approach may use more
coins than necessary. It will use 10 + 1 + 1 + 1 + 1 + 1, a total of 6 coins, whereas the
same amount could be made with only 3 coins (7 + 7 + 1).
Hence, we may conclude that the greedy approach picks an immediate optimized
solution and may fail where global optimization is a major concern.
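The coin-counting procedure above can be sketched as follows (the amounts and denominations are the ones used in the text):

```python
def greedy_coins(amount, denominations):
    """Repeatedly pick the largest coin that still fits into the remaining amount."""
    picked = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            picked.append(coin)
            amount -= coin
    return picked

print(greedy_coins(18, [1, 2, 5, 10]))  # [10, 5, 2, 1] -> 4 coins, optimal
# For the 1, 7, 10 system the greedy choice is suboptimal for 15:
print(greedy_coins(15, [1, 7, 10]))     # [10, 1, 1, 1, 1, 1] -> 6 coins, vs. 7 + 7 + 1
```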
Examples
Most networking algorithms use the greedy approach. Here is a list of a few of them −
Travelling Salesman Problem
Prim's Minimal Spanning Tree Algorithm
Kruskal's Minimal Spanning Tree Algorithm
Dijkstra's Shortest Path Algorithm
Graph - Map Coloring
Graph - Vertex Cover
Dynamic programming
The dynamic programming approach is similar to divide and conquer in breaking the
problem down into smaller and smaller possible sub-problems. But unlike divide and
conquer, these sub-problems are not solved independently. Rather, the results of these
smaller sub-problems are remembered and reused for similar or overlapping sub-problems.
Dynamic programming is used for problems that can be divided into similar
sub-problems, so that their results can be reused. Mostly, these algorithms are used
for optimization. Before solving the sub-problem at hand, a dynamic algorithm will
examine the results of previously solved sub-problems. The solutions of sub-problems
are combined in order to achieve the best solution.
So we can say that −
The problem should be divisible into smaller, overlapping sub-problems.
An optimum solution can be achieved by using optimum solutions of smaller sub-
problems.
Dynamic algorithms use memoization.
Comparison
In contrast to greedy algorithms, which address local optimization, dynamic
algorithms aim at an overall optimization of the problem.
In contrast to divide and conquer algorithms, where solutions are combined to
achieve an overall solution, dynamic algorithms use the output of a smaller sub-problem
and then try to optimize a bigger sub-problem. Dynamic algorithms use memoization to
remember the output of already solved sub-problems.
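Memoization can be sketched with the classic Fibonacci example (this particular example is an illustration, not taken from the text):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Each sub-problem fib(n) is solved once; repeated calls hit the cache,
    # turning the exponential plain recursion into O(n) work.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(30))  # 832040
```

Without the cache, fib(30) would make over a million recursive calls; with it, each of the 31 sub-problems is computed exactly once.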
Example
The following computer problems can be solved using dynamic programming approach −
MultiStage Graph
All pair shortest path by Floyd-Warshall
Shortest path by Dijkstra
Single Source Shortest Path by Bellman Ford
Dynamic programming can be used in both top-down and bottom-up manner.
Shortest Paths:
Breadth-first search strategy can be used to find a shortest path from some starting
vertex to every other vertex in a connected graph.
Dijkstra’s Algorithm:
Dijkstra’s algorithm solves the single-source shortest-path problem by performing a
“weighted” breadth-first search starting at the source vertex s.
In each iteration, the next vertex chosen is the vertex outside the cloud that is closest
to s. The algorithm terminates when no more vertices are outside the cloud.
Applying the greedy method to the single-source shortest-path problem results in
the algorithm known as Dijkstra’s algorithm.
Edge Relaxation:
Procedure:
• Assign the source node as S and enqueue S.
• Dequeue the vertex S from the queue, mark that vertex as known, and then find its
adjacent vertices.
• If the distance of an adjacent vertex is equal to infinity, change its distance to the
distance of its source vertex incremented by 1, and enqueue the vertex.
• Repeat from the second step until the queue becomes empty.
Algorithm ShortestPath(G, s):
#Input: A weighted graph G with nonnegative edge weights, and a distinguished vertex s of G.
#Output: The length of a shortest path from s to v for each vertex v of G.
Initialize D[s] = 0 and D[v] = ∞ for each vertex v ≠ s.
Let a priority queue Q contain all the vertices of G, using the D labels as keys.
while Q is not empty do
    u = value returned by Q.remove_min()   {pull a new vertex u into the cloud}
    for each vertex v adjacent to u such that v is in Q do
        if D[u] + w(u,v) < D[v] then   {perform the relaxation procedure on edge (u,v)}
            D[v] = D[u] + w(u,v)
            Change to D[v] the key of vertex v in Q.
return the label D[v] of each vertex v
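The pseudocode above can be sketched as runnable Python using the standard-library heapq module as the priority queue; the small graph g below is a hypothetical example, not the figure from the text:

```python
import heapq

def dijkstra(graph, s):
    """graph: dict mapping each vertex to a list of (neighbour, weight) pairs."""
    D = {v: float('inf') for v in graph}
    D[s] = 0
    pq = [(0, s)]                      # priority queue keyed by tentative distance
    while pq:
        d, u = heapq.heappop(pq)       # pull the closest vertex into the cloud
        if d > D[u]:
            continue                   # stale entry: u was already finalized
        for v, w in graph[u]:
            if D[u] + w < D[v]:        # edge relaxation on (u, v)
                D[v] = D[u] + w
                heapq.heappush(pq, (D[v], v))
    return D

# hypothetical example graph
g = {'s': [('a', 2), ('b', 5)], 'a': [('b', 1)], 'b': []}
print(dijkstra(g, 's'))  # {'s': 0, 'a': 2, 'b': 3}
```

Instead of decreasing a key in place, this sketch pushes a new entry and skips stale ones on pop, a common substitute when the priority queue has no decrease-key operation.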
Solution:
1. v1 is taken as source.
2. Now v1 is a known vertex (marked as known). Its adjacent vertices are v2 and v4; their
pv and dv values are updated:
T[v2].dist = Min(T[v2].dist, T[v1].dist + Cv1,v2) = Min(∞, 0 + 2) = 2
T[v4].dist = Min(T[v4].dist, T[v1].dist + Cv1,v4) = Min(∞, 0 + 1) = 1
3. Select the vertex with minimum distance among v2 and v4. v4 is marked as a known
vertex. Its adjacent vertices are v3, v5, v6 and v7:
T[v3].dist = Min(T[v3].dist, T[v4].dist + Cv4,v3) = Min(∞, 1 + 2) = 3
T[v5].dist = Min(T[v5].dist, T[v4].dist + Cv4,v5) = Min(∞, 1 + 2) = 3
T[v6].dist = Min(T[v6].dist, T[v4].dist + Cv4,v6) = Min(∞, 1 + 8) = 9
T[v7].dist = Min(T[v7].dist, T[v4].dist + Cv4,v7) = Min(∞, 1 + 4) = 5
4. Select the vertex with the shortest distance from the source v1. v2 is the smallest one
and is marked as a known vertex. Its adjacent vertices are v4 and v5. The distances from
v1 to v4 and v5 through v2 are larger than the previous dv values, so there is no change
in the dv and pv values.
5. Select the next smallest vertex from the source. v3 and v5 are the smallest ones. The
adjacent vertices of v3 are v1 and v6. v1 is the source, so there is no change in its dv and pv.
T[v6].dist = Min(T[v6].dist, T[v3].dist + Cv3,v6) = Min(9, 3 + 5) = 8
Its dv and pv values are updated. The only adjacent vertex of v5 is v7; there is no change
in its dv and pv values.
7. The last vertex v6 is declared known. There are no adjacent vertices for v6, so there are
no updates to the table.
Prim's Algorithm (Minimum Spanning Tree):
Procedure:
o We begin with some vertex s, defining the initial “cloud” of vertices C.
o Then, in each iteration, we choose a minimum-weight edge e = (u,v) connecting
a vertex u in the cloud C to a vertex v outside of C.
o The vertex v is then brought into the cloud C, and the process is repeated until
a spanning tree is formed.
Algorithm PrimJarnik(G):
Input: An undirected, weighted, connected graph G with n vertices and m edges
Output: A minimum spanning tree T for G
Pick any vertex s of G

Python code (G below is a sample adjacency matrix in which 0 means “no edge”; the
original figure’s graph is not reproduced here):

INF = 9999999
V = 5
G = [[0, 9, 75, 0, 0],
     [9, 0, 95, 19, 42],
     [75, 95, 0, 51, 66],
     [0, 19, 51, 0, 31],
     [0, 42, 66, 31, 0]]
selected = [False] * V   # selected[i] is True once vertex i is in the tree
no_edge = 0
selected[0] = True       # start from vertex 0 (the vertex s)
print("Edge : Weight\n")
while no_edge < V - 1:
    minimum = INF
    x = 0
    y = 0
    for i in range(V):
        if selected[i]:
            for j in range(V):
                if (not selected[j]) and G[i][j]:
                    # edge (i, j) leaves the cloud; keep the lightest one
                    if minimum > G[i][j]:
                        minimum = G[i][j]
                        x = i
                        y = j
    print(str(x) + "-" + str(y) + ":" + str(G[x][y]))
    selected[y] = True
    no_edge += 1
Example
5. What is a loop?
An edge of a graph that connects a vertex to itself is called a loop or sling.
3. Adjacency Map - is very similar to an adjacency list, but the secondary container
of all edges incident to a vertex is organized as a map.
4. Adjacency Matrix - Each entry is dedicated to storing a reference to the edge
(u,v) for a particular pair of vertices u and v; if no such edge exists, the entry will
be None.
21. List the two important key points of depth first search.
i) If a path exists from one node to another node, walk across the edge – exploring the edge.
ii) If no path exists from the current node to any unvisited node, return to the previous
node we came from – backtracking.
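The two key points above (exploring and backtracking) can be sketched with a recursive depth-first search; the graph g is a hypothetical example:

```python
def dfs(graph, node, visited=None):
    # Explore: walk across an edge to an unvisited neighbour;
    # backtracking happens automatically when the recursion returns.
    if visited is None:
        visited = []
    visited.append(node)
    for neighbour in graph[node]:
        if neighbour not in visited:
            dfs(graph, neighbour, visited)
    return visited

# hypothetical example graph
g = {'A': ['B', 'C'], 'B': ['D'], 'C': [], 'D': []}
print(dfs(g, 'A'))  # ['A', 'B', 'D', 'C']
```

The recursion stack plays the role that the explicit FIFO queue plays in BFS, which is why DFS goes deep before it goes wide.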