


CD3291- DSA Study Material

Data Structures (Anna University)


Downloaded by s h s g (pgpcet2022@gmail.com)

DEPARTMENT OF INFORMATION TECHNOLOGY

ACADEMIC YEAR 2022-23

CD3291 – DATA STRUCTURES AND


ALGORITHMS

STUDY MATERIAL


AD3251 DATA STRUCTURES DESIGN


COURSE OBJECTIVES:
● To understand the concepts of ADTs
● To design linear data structures – lists, stacks, and queues
● To understand sorting, searching and hashing algorithms
● To apply Tree and Graph structures
UNIT I ABSTRACT DATA TYPES 9
Abstract Data Types (ADTs) – ADTs and classes – introduction to OOP – classes in
Python – inheritance – namespaces – shallow and deep copying. Introduction to analysis of
algorithms – asymptotic notations –Divide and Conquer– recursion – analyzing recursive
algorithms
UNIT II LINEAR STRUCTURES 9
List ADT – array-based implementations – linked list implementations – singly linked
lists – circularly linked lists – doubly linked lists – Stack ADT – Queue ADT – double ended
queues – applications
UNIT III SORTING AND SEARCHING 9
Bubble sort – selection sort – insertion sort – merge sort – quick sort – analysis of
sorting algorithms - linear search – binary search – hashing – hash functions – collision
handling – load factors, rehashing, and efficiency
UNIT IV TREE STRUCTURES 9
Tree ADT – Binary Tree ADT – tree traversals – binary search trees – AVL trees –
heaps – multiway search trees
UNIT V GRAPH STRUCTURES 9
Graph ADT – representations of graph – graph traversals – DAG – topological ordering
– Greedy algorithms – dynamic programming – shortest paths – minimum spanning trees –
introduction to complexity classes and intractability

TOTAL: 45 HOURS
COURSE OUTCOMES:
At the end of the course, the student should be able to:
● Explain abstract data types.
● Design, implement, and analyse linear data structures, such as lists, queues, and stacks, according to the needs of different applications.
● Design, implement, and analyse efficient tree structures to meet requirements such as searching, indexing, and sorting.
● Model problems as graph problems and implement efficient graph algorithms to solve them.
TEXT BOOKS:
1. Michael T. Goodrich, Roberto Tamassia, and Michael H. Goldwasser, “Data Structures
and Algorithms in Python” (An Indian Adaptation), Wiley, 2021.
2. Lee, Kent D., Hubbard, Steve, “Data Structures and Algorithms with Python” Springer
Edition 2015.
3. Narasimha Karumanchi, “Data Structures and Algorithmic Thinking with Python”
Careermonk, 2015.


UNIT I - ABSTRACT DATA TYPES

Variables

Variables are placeholders for representing data. In computer programming, variables are used to hold data.

Ex: in the equation x² + 2y − 2 = 1, x and y are variables.

Data Types

A data type in a programming language is a set of data with predefined values.


Examples of data types are: integer (2 bytes), floating point (4 bytes), unsigned number, character, string, etc.

At the top level, there are two types of data types:

• System-defined data types (also called Primitive data types)

• User-defined data types

System-defined data types (Primitive data types)


Data types that are defined by the system are called primitive data types. The primitive
data types provided by many programming languages are: int, float, char, double, bool, etc.

The number of bits allocated for each primitive data type depends on the programming
languages, the compiler and the operating system.

For the same primitive data type, different languages may use different sizes.
Depending on the size of the data types, the total available values (domain) will also change.

For example, “int” may take 2 bytes or 4 bytes. If it takes 2 bytes (16 bits), then the
total possible values are −32,768 to +32,767 (−2^15 to 2^15 − 1). If it takes 4 bytes (32
bits), then the possible values are between −2,147,483,648 and +2,147,483,647 (−2^31 to
2^31 − 1). The same is the case with other data types.


User defined data types

If the system-defined data types are not enough, then most programming languages
allow users to define their own data types, called user-defined data types.

Good examples of user-defined data types are: structures in C/C++ and classes in
Java.

For example, in the snippet below, we are combining many system-defined data types
and calling the user defined data type by the name “newType”.

This gives more flexibility and comfort in dealing with computer memory.

struct newType
{
    int data1;
    float data2;
    .
    .
    .
    char datan;
};
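Since this course uses Python, a comparable user-defined type can be sketched as a class. This is an illustrative sketch (the name NewType and its fields mirror the hypothetical C struct above, not any standard library type):

```python
# A Python analogue of the C struct above: a user-defined type that
# combines several system-defined types into one record.
class NewType:
    def __init__(self, data1, data2, datan):
        self.data1 = data1   # int
        self.data2 = data2   # float
        self.datan = datan   # single-character string

record = NewType(10, 2.5, 'x')
print(record.data1, record.data2, record.datan)   # 10 2.5 x
```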

Data Structures
Data structure is a particular way of storing and organizing data in a computer so that
it can be used efficiently.

A data structure is a special format for organizing and storing data. General data
structure types include arrays, files, linked lists, stacks, queues, trees, graphs and so on.

Depending on the organization of the elements, data structures are classified into two types:

1) Linear data structures: Elements are accessed in a sequential order but it is not
compulsory to store all elements sequentially. Examples: Linked Lists, Stacks and Queues.

2) Non – linear data structures: Elements of this data structure are stored/accessed
in a non-linear order. Examples: Trees and graphs.

Note: For system-defined data types, by default the system supports


implementations/operations like addition, subtraction etc.


Abstract Data Types (ADTs)


For user-defined data types we also need to define operations. The implementation for
these operations can be done when we want to actually use them. That means, in general,
user defined data types are defined along with their operations.

To simplify the process of solving problems, we combine the data structures with their
operations, and we call the result Abstract Data Types (ADTs). An ADT consists of two parts:

1. Declaration of data
2. Declaration of operations

Commonly used ADTs include: Linked Lists, Stacks, Queues, Priority Queues, Binary
Trees, Dictionaries, Disjoint Sets (Union and Find), Hash Tables, Graphs, and many others.

Object-Oriented Design Goals:


Software implementations should achieve robustness, adaptability, and reusability.

Robustness
A program produces the right output for all the anticipated inputs in the program’s
application. In addition, we want software to be robust, that is, capable of handling
unexpected inputs that are not explicitly defined for its application.

For example, if a program is expecting a positive integer and instead is given a


negative integer, then the program should be able to recover gracefully from this error.
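A minimal sketch of such graceful recovery in Python (the function name and the sample inputs are illustrative, not from the original text):

```python
def read_positive_int(text):
    """Parse text as a positive integer, raising ValueError otherwise."""
    value = int(text)   # raises ValueError for non-numeric input
    if value <= 0:
        raise ValueError("expected a positive integer, got %d" % value)
    return value

# The caller recovers gracefully instead of crashing on bad input.
for raw in ["42", "-7", "abc"]:
    try:
        print(read_positive_int(raw))
    except ValueError as err:
        print("rejected:", err)
```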

Adaptability
Software, therefore, needs to be able to evolve over time in response to changing
conditions in its environment. Thus, another important goal of quality software is that it
achieves adaptability (also called evolvability).

Related to this concept is portability, which is the ability of software to run with minimal
change on different hardware and operating system platforms. An advantage of writing
software in Python is the portability provided by the language itself.
Reusability
Developing quality software can be an expensive enterprise, and its cost can be offset
somewhat if the software is designed in a way that makes it easily reusable in future
applications.

Such reuse should be done with care, however, for one of the major sources of
software errors in the Therac-25 came from inappropriate reuse of Therac-20 software.


Object-Oriented Design Principles


The chief principles of the object-oriented approach are:
● Modularity
● Abstraction
● Encapsulation
Modularity
Modularity refers to an organizing principle in which different components of a
software system are divided into separate functional units.

Using modularity in a software system can also provide a powerful organizing


framework that brings clarity to an implementation.

In Python, we have already seen that a module is a collection of closely related


functions and classes that are defined together in a single file of source code.

Python’s standard libraries include, for example, the math module, which provides
definitions for key mathematical constants and functions, and the os module, which provides
support for interacting with the operating system.

Abstraction

Applying the abstraction paradigm to the design of data structures gives rise to abstract
data types (ADTs). An ADT is a mathematical model of a data structure that specifies the type
of data stored, the operations supported on them, and the types of the parameters of the
operations. The collective set of behaviours supported by an ADT serves as its public interface.

Python supports abstract data types using a mechanism known as an abstract base
class (ABC). An abstract base class cannot be instantiated (i.e., you cannot directly create an
instance of that class), but it defines one or more common methods that all implementations
of the abstraction must have.

An ABC is realized by one or more concrete classes that inherit from the abstract base
class while providing implementations for those methods declared by the ABC.
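This mechanism can be sketched with Python's abc module (the class names MySequence and MyRange here are illustrative, not the standard library's):

```python
from abc import ABC, abstractmethod

class MySequence(ABC):        # abstract base class: cannot be instantiated
    @abstractmethod
    def __len__(self):
        ...

class MyRange(MySequence):    # concrete class realizing the ABC
    def __init__(self, n):
        self._n = n

    def __len__(self):        # provides the implementation declared by the ABC
        return self._n

print(len(MyRange(5)))        # 5
try:
    MySequence()              # instantiating the ABC itself fails
except TypeError as err:
    print("cannot instantiate:", err)
```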

Encapsulation

Another important principle of object-oriented design is encapsulation. Different


components of a software system should not reveal the internal details of their
respective implementations.

One of the main advantages of encapsulation is that it gives one programmer freedom
to implement the details of a component, without concern that other programmers will be
writing code that intricately depends on those internal decisions.


Encapsulation yields robustness and adaptability, for it allows the implementation


details of parts of a program to change without adversely affecting other parts, thereby making
it easier to fix bugs or add new functionality with relatively local changes to a component.

Class Definitions

A class serves as the primary means for abstraction in object-oriented programming.


In Python, every piece of data is represented as an instance of some class.

A class provides a set of behaviors in the form of member functions (also known as
methods), with implementations that are common to all instances of that class.

A class also serves as a blueprint for its instances, effectively determining the way
that state information for each instance is represented in the form of attributes (also known as
fields, instance variables, or data members)

Defining a Class:

Just as function definitions begin with the def keyword in Python, class definitions begin
with the class keyword.

The first string inside the class is called docstring and has a brief description of the
class. Although not mandatory, this is highly recommended.

Here is a simple class definition:

class MyNewClass:
    '''This is a docstring. I have created a new class'''
    pass
A class creates a new local namespace where all its attributes are defined. Attributes
may be data or functions.

There are also special attributes that begin with double underscores. For
example, __doc__ gives us the docstring of that class.

As soon as we define a class, a new class object is created with the same name. This
class object allows us to access the different attributes as well as to instantiate new objects of
that class.

Example:
class Person:
    "This is a person class"
    age = 10

    def greet(self):
        print('Hello')

print(Person.age)
print(Person.greet)
print(Person.__doc__)

Output:
10
<function Person.greet at 0x7fc78c6e8160>
This is a person class

Creating an Object in Python

We saw that the class object could be used to access different attributes. It can also
be used to create new object instances (instantiation) of that class. The procedure to create
an object is similar to a function call.

>>> harry = Person()

This will create a new object instance named harry. We can access the attributes of
objects using the object name prefix.

Attributes may be data or methods. Methods of an object are the corresponding functions
of that class. That is, since Person.greet is a function object (an attribute of the class),
harry.greet will be a method object.

Example:
class CreditCard:
    def __init__(self, customer, bank, acnt, limit):
        self._customer = customer
        self._bank = bank
        self._account = acnt
        self._limit = limit
        self._balance = 0

    def get_customer(self):
        return self._customer

    def get_bank(self):
        return self._bank

    def get_account(self):
        return self._account

    def get_limit(self):
        return self._limit

    def get_balance(self):
        return self._balance

    def charge(self, price):
        if price + self._balance > self._limit:   # if charge would exceed limit,
            return False                          # cannot accept charge
        else:
            self._balance += price
            return True

    def make_payment(self, amount):
        self._balance -= amount


The Constructor
The __init__ method serves as the constructor of the class. Its primary responsibility
is to establish the state of a newly created credit card object with appropriate instance
variables.

Encapsulation
A single leading underscore in the name of a data member, such as _balance, implies
that it is intended as non-public. Users of a class should not directly access such members.

In the context of data structures, encapsulating the internal representation allows us


greater flexibility to redesign the way a class works, perhaps to improve the efficiency of the
structure.

Additional Methods
The most interesting behaviours in our class are charge and make_payment. The
charge method typically adds the given price to the credit card balance, to reflect a purchase
of that price by the customer.
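A self-contained sketch of how such a class might be exercised. This is a simplified, hypothetical version of the credit-card class (only a limit, charge, make_payment, and get_balance), not the textbook's full implementation:

```python
class CreditCard:
    """Simplified sketch of a credit card supporting charges and payments."""
    def __init__(self, limit):
        self._limit = limit
        self._balance = 0

    def charge(self, price):
        if price + self._balance > self._limit:   # would exceed limit,
            return False                          # cannot accept charge
        self._balance += price
        return True

    def make_payment(self, amount):
        self._balance -= amount

    def get_balance(self):
        return self._balance

card = CreditCard(limit=1000)
print(card.charge(600))      # True  -> balance becomes 600
print(card.charge(500))      # False -> 1100 would exceed the 1000 limit
card.make_payment(200)
print(card.get_balance())    # 400
```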

Inheritance:

In object-oriented programming, the mechanism for a modular and hierarchical


organization is a technique known as inheritance. This allows a new class to be defined
based upon an existing class as the starting point. In object-oriented terminology, the existing
class is typically described as the base class, parent class, or superclass, while the newly
defined class is known as the subclass or child class.

There are two ways in which a subclass can differentiate itself from its superclass. A
subclass may specialize an existing behaviour by providing a new implementation that
overrides an existing method. A subclass may also extend its superclass by providing brand
new methods.

Syntax:

class BaseClass:
    # body of base class

class DerivedClass(BaseClass):
    # body of derived class

A derived class inherits features from the base class, and new features can be added to
it. This results in reusability of code.


Types of Inheritance
Depending upon the number of child and parent classes involved, there are four types
of inheritance in python.

Single Inheritance
When a child class inherits only a single parent class.
Example:
class Parent:
    def func1(self):
        print("this is function one")

class Child(Parent):
    def func2(self):
        print("this is function 2")

ob = Child()
ob.func1()
ob.func2()

Multiple Inheritance
When a child class inherits from more than one parent class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Parent2:
    def func2(self):
        print("this is function 2")

class Child(Parent, Parent2):
    def func3(self):
        print("this is function 3")

ob = Child()
ob.func1()
ob.func2()
ob.func3()


Multilevel Inheritance
When a child class becomes a parent class for another child class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child2(Child):
    def func3(self):
        print("this is function 3")

ob = Child2()
ob.func1()
ob.func2()
ob.func3()

Hierarchical Inheritance
In hierarchical inheritance, more than one child class inherits from the same base (parent)
class.
Example:
class Parent:
    def func1(self):
        print("this is function one")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child1(Parent):
    def func3(self):
        print("this is function 3")

ob = Child()
ob.func1()
ob.func2()
ob1 = Child1()
ob1.func1()
ob1.func3()

Scopes and Namespaces


Whenever an identifier is assigned to a value, that definition is made with a specific
scope. Top-level assignments are typically made in what is known as global scope.
Assignments made within the body of a function typically have scope that is local to that
function call. Therefore, an assignment x = 5 within a function has no effect on the identifier
x in the broader scope.

Each distinct scope in Python is represented using an abstraction known as a
namespace. A namespace manages all identifiers that are currently defined in a given scope.
The process of determining the value associated with an identifier is known as name
resolution.
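The scoping rule described above can be seen in a short sketch (the names x and f are illustrative):

```python
x = 5                 # assignment in global scope

def f():
    x = 10            # local assignment: binds x only inside this call
    print(x)

f()                   # prints 10 (the local x)
print(x)              # prints 5  (the global x is unaffected)
```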
In a Python program, there are three types of namespaces:

1. Built-In
2. Global
3. Local

These have differing lifetimes. As Python executes a program, it creates namespaces as


necessary and deletes them when they’re no longer needed. Typically, many namespaces will
exist at any given time.


a. Built-in Namespace in Python

This namespace gets created when the interpreter starts and stores all the built-in
names. It encloses all the other namespaces, which is the reason we can use
print, True, etc. from any part of the code.

b. Python Global Namespace

This is the namespace that holds all the global objects. This namespace gets created
when the program starts running and exists till the end of the execution.

Example of Global Namespace in Python:


text = "PythonGeeks"

def func():
    print(text)

func()
print(text)

Output:
PythonGeeks
PythonGeeks

c. Python Local Namespace

This namespace generally exists only for part of the program's execution. It stores
the names of the objects created inside a function and lasts as long as the function is
executing. This is the reason we cannot globally access a variable created inside a function.

Example of local namespace:


var1 = "PythonGeeks"

def func():
    var2 = "Python"
    print(var2)

func()

Output:
Python


Copy an Object in Python


In Python, we use the = operator to create a copy of an object. However, it only creates a
new variable that shares the reference of the original object.

Let's take an example where we create a list named old_list and pass an object
reference to new_list using the = operator.
Example:
# Copy using = operator
old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 'a']]
new_list = old_list

new_list[2][2] = 9

print('Old List:', old_list)
print('ID of Old List:', id(old_list))
print('New List:', new_list)
print('ID of New List:', id(new_list))

The output will be:
Old List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of Old List: 140673303268168
New List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of New List: 140673303268168

As the output shows, both variables old_list and new_list share the same id, i.e.
140673303268168. So changes in either new_list or old_list will be visible in both.
Sometimes, however, you may want to keep the original values unchanged and modify only
the new values, or vice versa. In Python, there are two ways to create copies:
● Shallow Copy
● Deep Copy
To make these copies, the copy module is used.
For example:
import copy
copy.copy(x)
copy.deepcopy(x)

Here, copy() returns a shallow copy of x. Similarly, deepcopy() returns a deep copy of x.


Shallow Copy
A shallow copy creates a new object which stores the reference of the original
elements. So, a shallow copy doesn't create a copy of nested objects, instead it just copies
the reference of nested objects. This means, a copy process does not recurse or create copies
of nested objects itself.
Example: Create a copy using shallow copy
import copy
old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
new_list = copy.copy(old_list)
print("Old list:", old_list)
print("New list:", new_list)
The output will be:
Old list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
New list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Example: Adding [4, 4, 4] to old_list, using shallow copy


import copy
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.copy(old_list)
old_list.append([4, 4, 4])
print("Old list:", old_list)
print("New list:", new_list)
Output:
Old list: [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

In the above program, a shallow copy of old_list is created. The new_list contains
references to the original nested objects stored in old_list. Then we append a new list, i.e.
[4, 4, 4], to old_list. This new sublist was not copied into new_list.

Example: Modifying a nested object using shallow copy


import copy
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.copy(old_list)
old_list[1][1] = 'AA'
print("Old list:", old_list)
print("New list:", new_list)
Output:
Old list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]


In the above program, the change to old_list, i.e. old_list[1][1] = 'AA', affects both
old_list and new_list at index [1][1]. This is because both lists share references to the same
nested objects.
Deep Copy
A deep copy creates a new object and recursively adds the copies of nested objects
present in the original elements.

Example: Copying a list using deepcopy()


import copy
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)
print("Old list:", old_list)
print("New list:", new_list)
Output:
Old list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

In the above program, the copy new_list is fully independent of old_list: changes to any
nested objects in the original old_list will not appear in the copy new_list.

Example: Modifying a nested object in the list using deep copy
import copy
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)
old_list[1][0] = 'BB'
print("Old list:", old_list)
print("New list:", new_list)
Output:
Old list: [[1, 1, 1], ['BB', 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

In the above program, the changes in old_list appear only in old_list. This means
both old_list and new_list are independent, because old_list was recursively copied,
which is true for all its nested objects.


Introduction to Analysis of Algorithms

A data structure is a systematic way of organizing and accessing data, and an algorithm
is a step-by-step procedure for performing some task in a finite amount of time. Algorithm
analysis helps us determine which algorithm is most efficient in terms of the time and space
consumed.

The running time of an algorithm can be calculated by executing it on various test


inputs and recording the time spent during each execution.

from time import time

start_time = time()              # record the starting time
# ... run algorithm ...
end_time = time()                # record the ending time
elapsed = end_time - start_time  # compute the elapsed time
Challenges of Experimental Analysis

● Experimental running times of two algorithms are difficult to compare directly unless
the experiments are performed in the same hardware and software environments.
● Experiments can be done only on a limited set of test inputs; hence, they leave out
the running times of inputs not included in the experiment (and these inputs may be
important).
● An algorithm must be fully implemented in order to execute it and study its running time
experimentally.

Moving Beyond Experimental Analysis

Our goal is to develop an approach to analyzing the efficiency of algorithms that:

1. Allows us to evaluate the relative efficiency of any two algorithms in a way that is
independent of the hardware and software environment.

2. Takes into account all possible inputs.

3. Is performed by studying a high-level description of the algorithm without need for


implementation.

Types of Analysis

Algorithm analysis considers the inputs for which the algorithm takes less time
(performs well) and the inputs for which the algorithm takes a long time.

Worst case
● Defines the input for which the algorithm takes a long time (slowest time to
complete).
● The input is the one for which the algorithm runs the slowest.


Best case
● Defines the input for which the algorithm takes the least time (fastest time to
complete).
● The input is the one for which the algorithm runs the fastest.
Average case
● Provides a prediction about the running time of the algorithm.
● Run the algorithm many times, using many different inputs, and divide the total by
the number of trials.
● Assumes that the input is random.

Lower Bound <= Average Time <= Upper Bound

Asymptotic Notation

For the best, average and worst cases, we need to identify the upper and lower
bounds. To represent these upper and lower bounds, we need some kind of syntax,
represented in the form of function f(n).

Big-O Notation [Upper Bounding Function]

This notation gives the tight upper bound of the given function. Thus, it gives the
worst-case complexity of an algorithm.

O(g(n)) = { f(n): there exist positive constants c and n0 such that
0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 }

Generally, it is represented as f(n) = O(g(n)). That means, at larger values of n, the


upper bound of f(n) is g(n).

For example, if f(n) = n^4 + 100n^2 + 10n + 50 is the running time of the given algorithm,
then g(n) = n^4. That means g(n) gives the maximum rate of growth of f(n) at larger values of n.


Big-O Examples

Example 1: Find an upper bound for f(n) = 3n + 8

Solution: 3n + 8 ≤ 4n, for all n ≥ 8
∴ 3n + 8 = O(n) with c = 4 and n0 = 8

Example 2: Find an upper bound for f(n) = n^2 + 1

Solution: n^2 + 1 ≤ 2n^2, for all n ≥ 1
∴ n^2 + 1 = O(n^2) with c = 2 and n0 = 1

Big Omega Notation [Lower Bounding Function]
Big Omega Notation [Lower Bounding Function]

This notation gives the tighter lower bound of the given algorithm and we represent it
as f(n) = Ω(g(n)). Thus, it provides the best-case complexity of an algorithm.

Ω(g(n)) = { f(n): there exist positive constants c and n0 such that
0 ≤ c·g(n) ≤ f(n) for all n ≥ n0 }

That means, at larger values of n, the tighter lower bound of f(n) is g(n). For example,
if f(n) = 100n^2 + 10n + 50, then g(n) = n^2, i.e. f(n) = Ω(n^2).

Example: 3n log n − 2n is Ω(n log n).

Solution: 3n log n − 2n = n log n + 2n(log n − 1) ≥ n log n for n ≥ 2; hence, we can take
c = 1 and n0 = 2 in this case.


Big-Theta

There is a notation that allows us to say that two functions grow at the same rate, up
to constant factors (i.e., it encloses the function from above and below). Since it represents
both the upper and the lower bound of the running time of an algorithm, it is used for
analyzing the average-case complexity of an algorithm.

Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such that
0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 }

f(n) is Θ(g(n)), pronounced “f(n) is big-Theta of g(n)”, if f(n) is O(g(n)) and f(n) is
Ω(g(n)); that is, there are real constants c1 > 0 and c2 > 0, and an integer constant n0 ≥ 1,
such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.

Example: 3n log n + 4n + 5 log n is Θ(n log n).

Solution: 3n log n ≤ 3n log n + 4n + 5 log n ≤ (3 + 4 + 5) n log n for n ≥ 2.
Asymptotic Analysis
Asymptotic Analysis

There are some general rules to help us determine the running time of an algorithm.

1) Loops: The running time of a loop is, at most, the running time of the statements inside the
loop (including tests) multiplied by the number of iterations.

Example:
// executes n times
for (i = 1; i <= n; i++)
    m = m + 2;   // constant time, c

Total time = a constant c × n = c·n = O(n).


2) Nested loops: Analyze from the inside out. Total running time is the product of the sizes of
all the loops.

Example:

// outer loop executes n times
for (i = 1; i <= n; i++)
    // inner loop executes n times
    for (j = 1; j <= n; j++)
        k = k + 1;   // constant time, c

Total time = c × n × n = c·n^2 = O(n^2).

3) Consecutive statements: Add the time complexities of each statement

Example:

x = x + 1;                     // constant time, c0

for (i = 1; i <= n; i++)
    m = m + 2;                 // constant time, c1

for (i = 1; i <= n; i++)       // outer loop
    for (j = 1; j <= n; j++)   // inner loop
        k = k + 1;             // constant time, c2

Total time = c0 + c1·n + c2·n^2 = O(n^2).

4) If-then-else statements:

Worst-case running time: the test, plus either the then part or the else part (whichever is the
larger).

if (length() == 0)                                // test: constant
    return false;
else
    for (int n = 0; n < length(); n++)            // loop: n times
        if (!list[n].equals(otherList.list[n]))   // constant time
            return false;

Total time = c0 + c1 + (c2 + c3) × n = O(n).


Divide and Conquer


A divide and conquer algorithm is a strategy of solving a large problem by
1. breaking the problem into smaller sub-problems
2. solving the sub-problems, and
3. combining them to get the desired output.
Divide and conquer algorithms are typically implemented using recursion.

How Do Divide and Conquer Algorithms Work?

Here are the steps involved:
1. Divide (Break): Divide the given problem into sub-problems using recursion.
2. Conquer (Solve): Solve the smaller sub-problems recursively. If a sub-problem is small
enough, solve it directly.
3. Combine (Merge): Combine the solutions of the sub-problems as part of the recursive
process to solve the actual problem.

Examples
The following program is an example of the divide-and-conquer approach, where
binary search is implemented in Python.
Binary Search implementation
In binary search we take a sorted list of elements and start looking for an element at
the middle of the list. If the search value matches the middle value in the list, we
complete the search. Otherwise we eliminate half of the list of elements by choosing
whether to proceed with the right or left half of the list, depending on the value of the item
searched.
This is possible because the list is sorted, which makes binary search much quicker
than linear search. Here we divide the given list and conquer by choosing the proper half of
the list. We repeat this approach till we find the element or conclude that it is absent from
the list.


Example
def bsearch(nums, val):
    idx0 = 0
    idxn = len(nums) - 1
    while idx0 <= idxn:
        # Find the middle-most index
        midval = (idx0 + idxn) // 2
        if nums[midval] == val:
            return midval
        # Compare the value with the middle-most value
        if val > nums[midval]:
            idx0 = midval + 1
        else:
            idxn = midval - 1
    return None

# Initialize the sorted list
nums = [2, 7, 19, 34, 53, 72]

# Print the search result
print(bsearch(nums, 72))
print(bsearch(nums, 11))
Output

When the above code is executed, it produces the following result –

5
None

Time Complexity

The complexity of the divide and conquer algorithm is calculated using the master
theorem.
T(n) = aT(n/b) + f(n)

where,
n = size of the input
a = number of subproblems in the recursion
n/b = size of each subproblem (all subproblems are assumed to have the same size)
f(n) = cost of the work done outside the recursive calls, which includes the cost of
dividing the problem and the cost of merging the solutions
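As an illustration (merge sort is covered later in the syllabus, so this is a preview): merge sort splits the input into $a = 2$ subproblems of size $n/b = n/2$ and merges the results in $f(n) = \Theta(n)$ time, giving the recurrence

```latex
T(n) = 2\,T\!\left(\frac{n}{2}\right) + \Theta(n)
```

which falls into the balanced case of the master theorem, since $f(n) = \Theta(n^{\log_b a}) = \Theta(n)$; hence $T(n) = \Theta(n \log n)$.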


Advantages of Divide and Conquer Algorithm

• The complexity of multiplying two matrices using the naive method is O(n^3), whereas the
divide and conquer approach (i.e. Strassen's matrix multiplication) takes O(n^2.8074). This
approach also simplifies other problems, such as the Tower of Hanoi.

• This approach is suitable for multiprocessing systems.

• It makes efficient use of memory caches.

Divide and Conquer Applications

• Merge Sort
• Quick Sort
• Binary Search
• Strassen's Matrix Multiplication
• Closest pair (points)


Recursion

Recursion is a technique by which a function makes one or more calls to itself during
execution, until the condition gets satisfied. Recursion provides a powerful alternative for
performing repetitive tasks.

Example

• The factorial function (commonly denoted as n!) is a classic mathematical function that
has a natural recursive definition.
• An English ruler has a recursive pattern that is a simple example of a fractal structure.
• Binary search is among the most important computer algorithms. It allows us to
efficiently locate a desired value in a data set with upwards of billions of entries.
• The file system for a computer has a recursive structure in which directories can be
nested arbitrarily deeply within other directories. Recursive algorithms are widely used
to explore and manage these file systems.

1) The Factorial Function

The factorial of a positive integer n, denoted n!, is defined as the product of the integers
from 1 to n. If n = 0, then n! is defined as 1 by convention. More formally, for any integer n ≥ 0,
n! = 1 if n = 0, and n! = n · (n−1) · · · 2 · 1 if n ≥ 1.

The factorial function is used to find the number of ways in which n distinct items can
be arranged into a sequence, that is, the number of permutations of n items. For example, the
three characters a, b, and c can be arranged in 3! = 3 · 2 · 1 = 6 ways: abc, acb, bac, bca,
cab, and cba.

A Recursive Implementation of the Factorial Function

Recursion is not just a mathematical notation; we can use recursion to design a Python
implementation of a factorial function, as shown in Code Fragment 4.1.

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)


Trace for the factorial function is,

A recursion trace for the call factorial(5)

2) Drawing an English Ruler

This is to draw the markings of a typical English ruler. For each inch, we place a tick
with a numeric label. We denote the length of the tick designating a whole inch as the major
tick length. Between the marks for whole inches, the ruler contains a series of minor ticks,
placed at intervals of 1/2 inch, 1/4 inch, and so on.

In general, an interval with a central tick length L ≥ 1 is composed of:

• An interval with a central tick length L−1
• A single tick of length L
• An interval with a central tick length L−1


Python Code:

def draw_line(tick_length, tick_label=''):
    """Draw one line with given tick length (followed by optional label)."""
    line = '-' * tick_length
    if tick_label:
        line += ' ' + tick_label
    print(line)

def drawinterval(centerlength):
    """Draw tick interval based upon a central tick length."""
    if centerlength > 0:                  # stop when length drops to 0
        drawinterval(centerlength - 1)    # recursively draw top ticks
        draw_line(centerlength)           # draw center tick
        drawinterval(centerlength - 1)    # recursively draw bottom ticks

def drawruler(numinches, majorlength):
    """Draw English ruler with given number of inches, major tick length."""
    draw_line(majorlength, '0')           # draw inch 0 line
    for j in range(1, 1 + numinches):
        drawinterval(majorlength - 1)     # draw interior ticks for inch
        draw_line(majorlength, str(j))    # draw inch j line and label

drawruler(2, 4)


Trace of the English Ruler Code:

3) Binary Search

Binary search, that is used to efficiently locate a target value within a sorted sequence
of n elements.

Values stored in sorted order within an indexable sequence,

such as a Python list. The numbers at top are the indices.

The algorithm maintains two parameters, low and high, such that all the candidate
entries have index at least low and at most high. Initially, low = 0 and high = n− 1. We then
compare the target value to the median candidate, that is, the item data[mid] with index

mid = (low + high) // 2

We consider three cases:

• If the target equals data[mid], then we have found the item we are looking for,
and the search terminates successfully.
• If target < data[mid], then we recur on the first half of the sequence, that is, on
the interval of indices from low to mid−1.
• If target > data[mid], then we recur on the second half of the sequence, that is,
on the interval of indices from mid+1 to high.


• An unsuccessful search occurs if low > high, as the interval [low, high] is empty.

This algorithm is known as binary search. Whereas sequential search runs in O(n)
time, the more efficient binary search runs in O(log n) time.
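The three cases above translate directly into a recursive implementation; the sketch below is one way to write it (the function and variable names are our own):

```python
def binary_search(data, target, low, high):
    """Return True if target is found in the sorted list data[low:high+1]."""
    if low > high:
        return False                   # interval is empty; unsuccessful search
    mid = (low + high) // 2
    if target == data[mid]:            # found a match
        return True
    elif target < data[mid]:
        # recur on the first half, indices low..mid-1
        return binary_search(data, target, low, mid - 1)
    else:
        # recur on the second half, indices mid+1..high
        return binary_search(data, target, mid + 1, high)

data = [2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37]
print(binary_search(data, 22, 0, len(data) - 1))  # True
print(binary_search(data, 23, 0, len(data) - 1))  # False
```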

4) File Systems

Modern operating systems define file-system directories (which are also sometimes
called “folders”) in a recursive way. Namely, a file system consists of a top-level directory, and
the contents of this directory consists of files and other directories, which in turn can contain
files and other directories, and so on.
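This recursive structure can be explored with a short disk-usage sketch using Python's standard os module (error handling is omitted for brevity):

```python
import os

def disk_usage(path):
    """Return the number of bytes used by a file/folder and any descendants."""
    total = os.path.getsize(path)              # account for direct usage
    if os.path.isdir(path):                    # if this is a directory,
        for filename in os.listdir(path):      # then for each child:
            childpath = os.path.join(path, filename)
            total += disk_usage(childpath)     # add child's usage to total
    return total
```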


Analyzing Recursive Algorithm

The efficiency of a recursive algorithm is expressed in big-Oh notation, which summarizes
the relationship between the number of operations and the input size for a problem.

With a recursive algorithm, we will account for each operation that is performed based
upon the particular activation of the function that manages the flow of control at the time it is
executed. Stated another way, for each invocation of the function, we only account for the
number of operations that are performed within the body of that activation. We can then
account for the overall number of operations that are executed as part of the recursive
algorithm by taking the sum, over all activations, of the number of operations that take place
during each individual activation

Computing Factorials

It is relatively easy to analyze the efficiency of our function for computing factorials.
Sample recursion trace is,

To compute factorial(n), there are a total of n+1 activations, as the parameter


decreases from n in the first call, to n−1 in the second call, and so on, until reaching the base
case with parameter 0. Each individual activation of factorial executes a constant number of
operations. Therefore, the overall number of operations for computing factorial(n) is O(n), as
there are n+1 activations, each of which accounts for O(1) operations.
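The activation count can be verified with a small sketch; the extra counter argument is our own addition for illustration:

```python
def factorial_counted(n, counter):
    """Factorial that also tallies activations in counter[0] (a one-element list)."""
    counter[0] += 1                    # one more activation of the function
    if n == 0:
        return 1
    return n * factorial_counted(n - 1, counter)

calls = [0]
print(factorial_counted(5, calls))  # 120
print(calls[0])                     # 6 activations, i.e. n + 1 for n = 5
```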

Drawing an English Ruler

In analyzing the English ruler application, the fundamental question is how many total
lines of output are generated by an initial call to drawinterval(c), where c denotes the center
length. This is a reasonable benchmark for the overall efficiency of the algorithm, as each line
of output is based upon a call to the draw_line utility, and each recursive call to drawinterval
with nonzero parameter makes exactly one direct call to draw_line. Some intuition may be
gained by examining the source code and the recursion trace. We know that a call to
drawinterval(c) for c > 0 spawns two calls to drawinterval(c−1) and a single call to draw_line.
We will rely on this intuition to prove the following claim. Proposition 4.1: For c ≥ 0, a call to
drawinterval(c) results in precisely 2^c − 1 lines of output.


Justification:

In fact, induction is a natural mathematical technique for proving the correctness and
efficiency of a recursive process. In the case of the ruler, we note that an application of
drawinterval(0) generates no output, and that 2^0 − 1 = 1 − 1 = 0. This serves as a base case
for our claim. More generally, the number of lines printed by drawinterval(c) is one more than
twice the number generated by a call to drawinterval(c−1), as one center line is printed between
two such recursive calls. By induction, we have that the number of lines is thus
1 + 2 · (2^(c−1) − 1) = 1 + 2^c − 2 = 2^c − 1. This proof is indicative of a more mathematically
rigorous tool, known as a recurrence equation, that can be used to analyze the running time of
a recursive algorithm.

Performing a Binary Search

Considering the running time of the binary search algorithm, a constant number of
primitive operations are executed at each recursive call of the binary search method. Hence,
the running time is proportional to the number of recursive calls performed. At most log n + 1
recursive calls are made during a binary search of a sequence having n elements, leading to
the following claim: the binary search algorithm runs in O(log n) time for a sorted sequence
with n elements.

Justification:

At each recursive call, the number of candidate entries still to be searched is given by the
value high − low + 1. Moreover, the number of remaining candidates is reduced by at least one
half with each recursive call. Specifically, from the definition of mid, the number of remaining
candidates is either

(mid − 1) − low + 1 ≤ (high − low + 1)/2  or  high − (mid + 1) + 1 ≤ (high − low + 1)/2.

Initially, the number of candidates is n; after the first call in a binary search, it is at most
n/2; after the second call, it is at most n/4; and so on. In general, after the j-th call in a binary
search, the number of candidate entries remaining is at most n/2^j. In the worst case (an
unsuccessful search), the recursive calls stop when there are no more candidate entries.
Hence, the maximum number of recursive calls performed is the smallest integer r such that
n/2^r < 1. In other words (recalling that we omit a logarithm's base when it is 2), r > log n. Thus,
we have r = ⌊log n⌋ + 1, which implies that binary search runs in O(log n) time.
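The halving argument can be simulated directly; this sketch counts how many times the candidate range can be halved before it becomes empty (the worst case):

```python
import math

def worst_case_calls(n):
    """Count recursive calls until the candidate range becomes empty."""
    calls = 0
    while n >= 1:
        n //= 2          # each call discards at least half of the candidates
        calls += 1
    return calls

print(worst_case_calls(1024))  # 11, which equals floor(log2(1024)) + 1
```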

Computing Disk Space Usage

To characterize the “problem size” for our analysis, we let n denote the number of file-
system entries in the portion of the file system that is considered. (For example, the file system
portrayed in Figure 4.6 has n = 19 entries.) To characterize the cumulative time spent for an


initial call to the disk usage function, we must analyze the total number of recursive invocations
that are made, as well as the number of operations that are executed within those invocations.

Intuitively, a call to disk usage is made exactly once for each entry 'e' of the file system
(from the loop in its parent directory), and that entry will only be explored once.

Each iteration of that loop makes a recursive call to disk usage, and yet we have already
concluded that there are a total of n calls to disk usage (including the original call). We
therefore conclude that there are O(n) recursive calls, each of which uses O(1) time outside
the loop, and that the overall number of operations due to the loop is O(n). Summing all of
these bounds, the overall number of operations is O(n).


Unit I – 2 Mark Questions with Answers

1. What are Variables?

Variables are placeholders for representing data. In computer programming, variables


are used to hold data.
Ex: x^2 + 2y − 2 = 1

2. What are Data Types?

A data type in a programming language is a set of data with predefined values.


Examples of data types are: integer-2 Bytes, floating point-4 Bytes, unit number, character,
string, etc.

At the top level, there are two types of data types:


• System-defined data types (also called Primitive data types)
• User-defined data types

3. What are System-defined data types (Or Primitive data types)?


Data types that are defined by system are called primitive data types. The primitive
data types provided by many programming languages are: int, float, char, double, bool, etc.

The number of bits allocated for each primitive data type depends on the programming
languages, the compiler and the operating system.

For the same primitive data type, different languages may use different sizes.
Depending on the size of the data types, the total available values (domain) will also change.

For example, “int” may take 2 bytes or 4 bytes. If it takes 2 bytes (16 bits), then the
total possible values are −32,768 to +32,767 (−2^15 to 2^15−1). If it takes 4 bytes (32
bits), then the possible values are between −2,147,483,648 and +2,147,483,647 (−2^31 to
2^31−1). The same is the case with other data types.

4. What are User defined data types?


If the system-defined data types are not enough, then most programming languages
allow the users to define their own data types, called user-defined data types.
Good examples of user-defined data types are structures in C/C++ and classes in
Java. For example, in the snippet below, we are combining many system-defined data types
and calling the user-defined data type by the name “newType”.
This gives more flexibility and comfort in dealing with computer memory.
struct newType
{
    int data1;
    float data2;
    .
    .
    .
    char data_n;
};


5. What are Data Structures?


Data structure is a particular way of storing and organizing data in a computer so that
it can be used efficiently.

A data structure is a special format for organizing and storing data. General data
structure types include arrays, files, linked lists, stacks, queues, trees, graphs and so on.

Depending on the organization of the elements, data structures are classified into two types:

1) Linear data structures: Elements are accessed in a sequential order but it is not
compulsory to store all elements sequentially. Examples: Linked Lists, Stacks and Queues.

2) Non – linear data structures: Elements of this data structure are stored/accessed in
a non-linear order. Examples: Trees and graphs.

6. What are Abstract Data Types? Or What is ADT?


An abstract data type (ADT) is defined as a mathematical model with a collection of
operations defined on that model. Sets of integers, together with the operations of union,
intersection and set difference, form an example of an ADT. An ADT consists of data together
with functions that operate on that data.
Advantages/Benefits of ADT:
1. Modularity
2. Reuse
3. Code is easier to understand
4. Implementations of ADTs can be changed without requiring changes to the programs
that use the ADTs.

7. Characteristics of Python?
Robustness
Adaptable
Reusable
Modular

8. Why Python is Robustness?


A program produces the right output for all the anticipated inputs in the program’s
application. In addition, we want software to be robust, that is, capable of handling unexpected
inputs that are not explicitly defined for its application.

For example, if a program is expecting a positive integer and instead is given a


negative integer, then the program should be able to recover gracefully from this error.


9. Is Python Adaptable?
Software, needs to be able to evolve over time in response to changing conditions in
its environment. Thus, another important goal of quality software is that it achieves adaptability
(also called evolvability).

Related to this concept is portability, which is the ability of software to run with minimal
change on different hardware and operating system platforms. An advantage of writing
software in Python is the portability provided by the language itself.

10. What is Reusability?


Developing quality software can be an expensive enterprise, and its cost can be offset
somewhat if the software is designed in a way that makes it easily reusable in future
applications.

Such reuse should be done with care, however, for one of the major sources of
software errors in the Therac-25 came from inappropriate reuse of Therac-20 software.

11. What are Object-Oriented Design Principles?


Chief principles of the object-oriented approach are:
• Modularity
• Abstraction
• Encapsulation

12. What is Modularity?


Modularity refers to an organizing principle in which different components of a software
system are divided into separate functional units.

Using modularity in a software system can also provide a powerful organizing


framework that brings clarity to an implementation.

13. What is Abstract base class?

Python supports abstract data types using a mechanism known as an abstract base
class (ABC). An abstract base class cannot be instantiated (i.e., you cannot directly create an
instance of that class), but it defines one or more common methods that all implementations
of the abstraction must have.

An ABC is realized by one or more concrete classes that inherit from the abstract base
class while providing implementations for those method declared by the ABC.


14. What is Encapsulation?

Wrapping up data and functions together into a single unit is called encapsulation.
Different components of a software system should not reveal the internal details of their
respective implementations.

Encapsulation yields robustness and adaptability, for it allows to fix bugs or add new
functionality with relatively local changes to a component.

15. Define Class.

A class is a collection of objects. A class contains the blueprints or the prototype from
which the objects are being created. It is a logical entity that contains some attributes and
methods.

A class provides a set of behaviors in the form of member functions (also known as
methods), with implementations that are common to all instances of that class.

A class determines the way that state information for each instance, it is represented
in the form of attributes (also known as fields, instance variables, or data members)

Example:
class Person:
    def __init__(self, a, b):
        print("Sum =", a + b)

obj = Person(2, 3)

Output:
Sum = 5

16. Define Constructor.

__init__ is a special method that serves as the constructor of a class. Its primary
responsibility is to establish the state of a newly created object with appropriate
instance variables.

Example:
class Person:
    def __init__(self, a, b):
        print("Sum =", a + b)

obj = Person(2, 3)

Output:
Sum = 5


17. What is Inheritance?

In object-oriented programming, the mechanism for a modular and hierarchical


organization is a technique known as inheritance.

This allows a new class to be defined based upon an existing class as the starting
point. In object-oriented terminology, the existing class is typically described as the base class,
parent class, or superclass, while the newly defined class is known as the subclass or child
class.

Syntax:
class BaseClass:
    # body of base class

class DerivedClass(BaseClass):
    # body of derived class

Derived class inherits features from the base class where new features can be added to
it. This results in re-usability of code.

18. List the types of Inheritance.


Depending upon the number of child and parent classes involved, there are four types
of inheritance in Python: single, multiple, multilevel and hierarchical inheritance.

19. Give example for Single Inheritance.


When a child class inherits only a single parent class.
Example:
class Parent:
    def func1(self):
        print("this is function one")

class Child(Parent):
    def func2(self):
        print("this is function 2")

ob = Child()
ob.func1()
ob.func2()


20. What is Multiple Inheritance?


When a child class inherits from more than one parent class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Parent2:
    def func2(self):
        print("this is function 2")

class Child(Parent, Parent2):
    def func3(self):
        print("this is function 3")

ob = Child()
ob.func1()
ob.func2()
ob.func3()

21. What is Multilevel Inheritance?


When a child class becomes a parent class for another child class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child2(Child):
    def func3(self):
        print("this is function 3")

ob = Child2()
ob.func1()
ob.func2()
ob.func3()


22. What is Hierarchical Inheritance?


Hierarchical inheritance involves multiple child classes inheriting from the same base
or parent class.
Example:
class Parent:
    def func1(self):
        print("this is function one")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child1(Parent):
    def func3(self):
        print("this is function 3")

ob = Child()
ob.func1()
ob.func2()
ob1 = Child1()
ob1.func1()
ob1.func3()

23. What is Scope?


Whenever an identifier is assigned to a value that definition is made with a specific
scope. Top-level assignments are typically made in what is known as global scope.
Assignments made within the body of a function typically have scope that is local to that
function call.

24. What are Namespaces?


Each distinct scope in Python is represented using an abstraction known as a
namespace. A namespace manages all identifiers that are currently defined in a given scope.
The process of determining the value associated with an identifier is known as name
resolution.
In a Python program, there are three types of namespaces:

1. Built-In
2. Global
3. Local

These have differing lifetimes. As Python executes a program, it creates namespaces as


necessary and deletes them when they’re no longer needed.


25. What is Python Global Namespace?

This is the namespace that holds all the global objects. This namespace gets created
when the program starts running and exists till the end of the execution.

Example of Global Namespace in Python:


text = "Python"

def func():
    print(text)

func()
print(text)

Output:
Python
Python
26. What are Built-in Namespaces in Python?

This namespace gets created when the interpreter starts. It stores all the keywords or
the built-in names. This is the superset of all the Namespaces. This is the reason we can use
print, True, etc. from any part of the code.

27. What is Python Local Namespace

This is the namespace that generally exists for some part of the time during the
execution of the program. This stores the names of those objects in a function.

These namespaces exist as long as the functions exist. This is the reason we cannot
globally access a variable, created inside a function.

Example of local namespace:


var1 = "Python"

def func():
    var2 = "Python"
    print(var2)

func()
print(var1)
print(var2)   # Error

Output:
Python
Python
NameError: name 'var2' is not defined

28. How to Copy an Object in Python?

In Python, we use the = operator to create a copy of an object. It only creates a new variable
that shares the reference of the original object.
In Python, there are two other ways to create copies:
o Shallow Copy
o Deep Copy
To make these copies work, the copy module is used.


For example:
import copy
copy.copy(x)
copy.deepcopy(x)

29. Give example for Python Copy methods.


Here, copy.copy(x) returns a shallow copy of x. Similarly, copy.deepcopy(x) returns a
deep copy of x.

30. How will you create a reference in python?


A shallow copy creates a new object which stores the reference of the original
elements. So, a shallow copy doesn't create a copy of nested objects, instead it just copies
the reference of nested objects. This means, a copy process does not recurse or create copies
of nested objects itself.
Example: Create a copy using shallow copy
import copy

old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
new_list = copy.copy(old_list)

print("Old list:", old_list)
print("New list:", new_list)

The output will be:
Old list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
New list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

31. Give the procedure to create a new object from original elements.
A deep copy creates a new object and recursively adds the copies of nested objects
present in the original elements.

Example: Copying a list using deepcopy()
import copy

old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)

print("Old list:", old_list)
print("New list:", new_list)

Output:
Old list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]


32. What is the use of Algorithm analysis?


Algorithm analysis helps us to determine which algorithm is most efficient in terms of
time and space consumed.

The running time of an algorithm can be calculated by executing it on various test


inputs and recording the time spent during each execution.

from time import time

start_time = time()                  # record the starting time
# run algorithm
end_time = time()                    # record the ending time
elapsed = end_time - start_time      # compute the elapsed time

33. What are the Challenges of Experimental Analysis?


• Experimental running times of two algorithms are difficult to directly compare unless
the experiments are performed in the same hardware and software environments.
• Experiments can be done only on a limited set of test inputs; hence, they leave out the
running times of inputs not included in the experiment (and these inputs may be
important).
• An algorithm must be fully implemented in order to execute it to study its running time
experimentally.

34. What is runtime analysis of algorithms?


In general, we mainly measure and compare the worst-case theoretical running-time
complexities of algorithms for performance analysis. The fastest possible running time for
any algorithm is O(1), commonly referred to as constant running time.

35. How to analyse the complexity of user input to an algorithm?


Algorithm analysis depends on which inputs the algorithm takes less time (performing
well) and with which inputs the algorithm takes a long time.

Worst case
• Defines the input for which the algorithm takes a long time (slowest time to
complete).
• Input is the one for which the algorithm runs the slowest.
Best case
• Defines the input for which the algorithm takes the least time (fastest time to
complete).
• Input is the one for which the algorithm runs the fastest.
Average case
• Provides a prediction about the running time of the algorithm.
• Run the algorithm many times, using many different inputs, and divide by the
number of trials.
• Assumes that the input is random.

Lower Bound <= Average Time <= Upper Bound


36. What is Asymptotic Notation?

An asymptotic notation is a notation that represents the complexity of an
algorithm. It is used to study how the running time of an algorithm grows as the value of the
input or the unknown variable increases. Therefore, it is also known as the "growth rate" of an
algorithm.

Big-O Notation – Upper Bound, represented as f(n) = O(g(n)).
Example 1: Find an upper bound for f(n) = 3n + 8.
Solution: 3n + 8 ≤ 4n, for all n ≥ 8
∴ 3n + 8 = O(n) with c = 4 and n0 = 8

Big-Omega Notation – Lower Bound, represented as f(n) = Ω(g(n)).
Example: 3n log n − 2n is Ω(n log n).
Solution: 3n log n − 2n = n log n + 2n(log n − 1) ≥ n log n for n ≥ 2; hence, we can take
c = 1 and n0 = 2 in this case.

Big-Theta Notation – says that two functions grow at the same rate, up to constant factors.
Example: 3n log n + 4n + 5 log n is Θ(n log n).
Solution: 3n log n ≤ 3n log n + 4n + 5 log n ≤ (3 + 4 + 5) n log n for n ≥ 2.

37. Give example for Asymptotic Analysis.

There are some general rules to help us determine the running time of an algorithm.

1) Loops: The running time of a loop is, at most, the running time of the statements inside the
loop (including tests) multiplied by the number of iterations.

Example:
// executes n times
for (i = 1; i <= n; i++)
    m = m + 2;       // constant time, c

Total time = a constant c × n = c·n = O(n).

38. How will you analyse the running time complexity of Nested Loops?

Nested loops: Analyze from the inside out. Total running time is the product of the sizes
of all the loops.

Example:
// outer loop executes n times
for (i = 1; i <= n; i++)
    // inner loop executes n times
    for (j = 1; j <= n; j++)
        k = k + 1;   // constant time, c

Total time = c × n × n = c·n^2 = O(n^2).


39. How will you analyse the running time complexity of Consecutive Statements?

Consecutive statements: Add the time complexities of each statement.

Example:
x = x + 1;                       // constant time
for (i = 1; i <= n; i++)
    m = m + 2;                   // constant time, c
// nested loops
for (i = 1; i <= n; i++)
    for (j = 1; j <= n; j++)
        k = k + 1;               // constant time, c

Total time = c0 + c1·n + c2·n^2 = O(n^2).

40. How will you analyse the running time complexity of if-then-else statements?

If-then-else statements – Worst-case running time: the test, plus either the then part or
the else part (whichever is larger).

// test: constant
if (length() == 0)
    return false;
else
    for (int n = 0; n < length(); n++)
        if (!list[n].equals(otherList.list[n]))
            return false;

Total time = c0 + c1 + (c2 + c3) × n = O(n).

41. What is Recursion?

Recursion is a technique by which a function makes one or more calls to itself during
execution, until the condition gets satisfied. Recursion provides a powerful alternative for
performing repetitive tasks.

Example – Finding Factorial of a given number:

A Recursive Implementation of the Factorial Function

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)


42. Give the procedure for drawing an English Ruler using Recursion.

• This is to draw the markings of a typical English ruler.
• For each inch, we place a tick with a numeric label.
• We denote the length of the tick designating a whole inch as the major tick
length.
• Between the marks for whole inches, the ruler contains a series of minor ticks,
placed at intervals of 1/2 inch, 1/4 inch, and so on.

Trace of the English Ruler Code:

43. Give procedure to find disk space using Recursion.


Modern operating systems define file-system directories (which are also sometimes
called “folders”) in a recursive way. Namely, a file system consists of a top-level directory, and
the contents of this directory consists of files and other directories, which in turn can contain
files and other directories, and so on.


UNIT II LINEAR STRUCTURES

List ADT – array-based implementations – linked list implementations – singly linked


lists – circularly linked lists – doubly linked lists – applications of lists – Stack ADT – Queue
ADT – double ended queues

List ADT:

List ADT can be implemented using Array or Python List.

Array Implementation:

Basic structure for storing and accessing a collection of data is the array. A one-
dimensional array is a collection of contiguous elements in which individual elements are
identified by a unique integer subscript starting with zero. Once an array is created, its size
cannot be changed.

• Array(size): Creates a one-dimensional array consisting of size elements with
each element initially set to None. size must be greater than zero.
• length(): Returns the length or number of elements in the array.
• getitem(index): Returns the value stored in the array at element position
index. The index argument must be within the valid range. Accessed using the
subscript operator.
• setitem(index, value): Modifies the contents of the array element at position
index to contain value. The index must be within the valid range. Accessed
using the subscript operator.
• clearing(value): Clears the array by setting every element to value.
• iterator(): Creates and returns an iterator that can be used to traverse the
elements of the array.
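The operations above can be sketched as a Python class backed by an ordinary list (a minimal illustration, not a production implementation; the dunder methods realize len() and the subscript operator):

```python
class Array:
    """One-dimensional array of fixed size; elements start out as None."""

    def __init__(self, size):
        assert size > 0, "Array size must be > 0"
        self._elements = [None] * size

    def __len__(self):                      # length()
        return len(self._elements)

    def __getitem__(self, index):           # getitem(index), via subscript operator
        assert 0 <= index < len(self._elements), "index out of range"
        return self._elements[index]

    def __setitem__(self, index, value):    # setitem(index, value)
        assert 0 <= index < len(self._elements), "index out of range"
        self._elements[index] = value

    def clearing(self, value):              # set every element to value
        for i in range(len(self._elements)):
            self._elements[i] = value

    def __iter__(self):                     # iterator()
        return iter(self._elements)
```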

Python List Implementation

Python’s list structure is a mutable sequence container that can change size as items
are added or removed. It is an abstract data type that is implemented using an array structure
to store the items contained in the list.


Appending Items

pyList.append( 50 )

If there is room in the array, the item is stored in the next available slot of the array and
the length field is incremented by one.

pyList.append( 18 )

pyList.append( 64 )

pyList.append( 6)

After the second statement is executed, the array becomes full and there is no
available space to add more values.

By definition, a list can contain any number of items and never becomes full. Thus,
when the third statement is executed, the array will have to be expanded to make room for
value 6. Array cannot change size once it has been created. To allow for the expansion of the
list, the following steps have to be performed:

(1) A new array is created with additional capacity,

(2) The items from the original array are copied to the new array,

(3) The new larger array is set as the data structure for the list,

(4) The original smaller array is destroyed. After the array has been expanded, the
value can be appended to the end of the list.
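The four expansion steps above can be sketched in plain Python. The class and field names here are illustrative, and an ordinary Python list stands in for the raw fixed-size array:

```python
class MiniList:
    """Illustrative append-with-doubling, mirroring the four expansion steps."""
    def __init__(self):
        self.capacity = 2                  # initial capacity of the raw array
        self.n = 0                         # current number of stored items
        self.array = [None] * self.capacity

    def append(self, item):
        if self.n == self.capacity:
            new_array = [None] * (2 * self.capacity)  # (1) new, larger array
            for i in range(self.n):                   # (2) copy items over
                new_array[i] = self.array[i]
            self.array = new_array                    # (3) new array backs the list
            self.capacity *= 2                        # (4) old array is reclaimed
        self.array[self.n] = item
        self.n += 1

lst = MiniList()
for v in (50, 18, 64, 6):      # the third append (64) triggers an expansion
    lst.append(v)
print(lst.n, lst.capacity)     # 4 4
```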


Extending a List

A list can be appended to a second list using the extend() method as shown in the
following example:

pyListA = [ 34, 12 ]

pyListB = [ 4, 6, 31, 9 ]

pyListA.extend( pyListB )

If the list being extended has the capacity to store all of the elements from the second
list, the elements are simply copied, element by element. If there is not enough capacity for all
of the elements, the underlying array has to be expanded as was done with the append()
method.


Inserting Items

An item can be inserted anywhere within the list using the insert() method. In the
following example pyList.insert( 3, 79 ) we insert the value 79 at index position 3. Since there
is already an item at that position, we must make room for the new item by shifting all of the
items down one position starting with the item at index position 3. After shifting the items, the
value 79 is then inserted at position 3.
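The shifting described above can be sketched as follows; the helper name is illustrative, and a Python list with a spare slot stands in for the fixed-size array:

```python
def insert_at(arr, n, pos, value):
    """Insert value at pos among the first n items, shifting items right."""
    for i in range(n, pos, -1):    # shift down one position, last item first
        arr[i] = arr[i - 1]
    arr[pos] = value
    return n + 1                   # new item count

data = [10, 20, 30, 40, None]      # n = 4 items, one free slot at the end
n = insert_at(data, 4, 3, 79)
print(data)                        # [10, 20, 30, 79, 40]
```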

Removing Items

An item can be removed from any position within the list using the pop() method.
Consider the following code segment, which removes both the first and last items from the
sample list:

pyList.pop( 0 ) # remove the first item

pyList.pop() # remove the last item

The first statement removes the first item from the list. After the item is removed,
typically by setting the reference variable to None, the items following it within the array are
shifted down, from left to right, to close the gap. Finally, the length of the list is decremented
to reflect the smaller size.

The second pop() operation in the example code removes the last item from the list.
Since there are no items following the last one, the only operations required are to remove the
item and decrement the size of the list. After removing an item from the list, the size of the
array may be reduced using a technique similar to that for expansion. This reduction occurs
when the number of available slots in the internal array falls below a certain threshold. For


example, when more than half of the array elements are empty, the size of the array may be
cut in half.
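Removal with shifting, followed by the halving rule just described, can be sketched like this (the helper name and the use of a Python list as the raw array are illustrative):

```python
def pop_at(arr, n, index):
    """Remove arr[index] from the first n slots by shifting left; return new count."""
    for j in range(index, n - 1):
        arr[j] = arr[j + 1]
    arr[n - 1] = None              # clear the vacated slot
    return n - 1

data = [34, 12, 4, 6, None, None]  # capacity 6, n = 4 items
n = 4
n = pop_at(data, n, 0)             # remove the first item: items shift left
n = pop_at(data, n, n - 1)         # remove the last item: no shifting needed
print(data[:n])                    # [12, 4]

# shrink when more than half of the slots are empty
if n < len(data) // 2:
    data = data[:len(data) // 2]   # capacity cut from 6 to 3
print(len(data))                   # 3
```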

List Slice
Slicing is an operation that creates a new list consisting of a contiguous subset of
elements from the original list. The original list is not modified by this operation. Instead,
references to the corresponding elements are copied and stored in the new list. In Python,
slicing is performed on a list using the colon operator, specifying the index of the first element
in the subset and the index just past the last element to be included. Consider the following
example code segment, which creates a slice from our sample list: aSlice = theVector[2:3]
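For instance, assuming theVector holds the five sample values shown here, the slice contains only the element at index 2, and the original list is left unmodified:

```python
theVector = [12, 7, 31, 44, 26]   # sample values (illustrative)
aSlice = theVector[2:3]           # from index 2 up to, but not including, index 3
print(aSlice)                     # [31]
print(theVector)                  # [12, 7, 31, 44, 26] -- original unchanged
```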

Python Code
import ctypes
class dy_array:
    def __init__(self):
        self.n=0                 # number of items stored
        self.capacity=1          # current capacity of the raw array
        self.Arr=self.makearray(self.capacity)

    def makearray(self,c):
        # create a low-level array of c object references
        return (c*ctypes.py_object)( )

    def findlength(self,obj):
        # count the number of items in an iterable
        count=0
        for i in obj:
            count+=1
        return count

    def getitem(self,x):
        # return the index of value x in the array, if present
        for i in range(self.n):
            if(self.Arr[i]==x):
                return i
        print("Data Not Found")

    def append(self,obj):
        if(self.n==self.capacity):
            self.resize(2*self.capacity)
        self.Arr[self.n]=obj
        self.n+=1

    def resize(self,c):
        B=self.makearray(c)
        for i in range(self.n):
            B[i]=self.Arr[i]
        self.Arr=B
        self.capacity=c

    def insert(self,pos,val):
        if(self.n==self.capacity):
            self.resize(2*self.capacity)
        for i in range(self.n,pos,-1):
            self.Arr[i]=self.Arr[i-1]
        self.Arr[pos]=val
        self.n+=1

    def extend(self,val):
        length=self.findlength(val)
        for i in range(length):
            self.append(val[i])

    def remove(self,val):
        for i in range(self.n):
            if self.Arr[i]==val:
                for j in range(i,self.n-1):
                    self.Arr[j]=self.Arr[j+1]
                self.n-=1
                return

    def disp(self):
        for i in range(self.n):
            print(self.Arr[i])


Linked List Implementation:

Linked list, which provides an alternative to an array-based sequence (such as a


Python list). An array provides the more centralized representation, with one large chunk of
memory capable of accommodating references to many elements.
A linked list, relies on a more distributed representation in which a lightweight object,
known as a node, is allocated for each element. Each node maintains a reference to its
element and one or more references to neighbouring nodes in order to collectively represent
the linear order of the sequence.

Singly Linked Lists


A singly linked list, is a collection of nodes that collectively form a linear sequence.
Each node stores a reference to an object that is an element of the sequence and a
reference to the next node of the list.

Singly Linked List Node Representation

Singly Linked List Representation

Inserting an Element at the Head of a Singly Linked List


1. Create new node instance storing reference to element e
2. Set new node’s next to reference the old head node
3. Set variable head to reference the new node
4. Increment the node count.


After Insertion

Algorithm add first(L,e):


newest = Node(e) #{create new node instance storing reference to element e}
newest.next = L.head #{set new node’s next to reference the old head node}
L.head = newest #{set variable head to reference the new node}
L.size = L.size+1 #{increment the node count}

Inserting an Element at the Tail of a Singly Linked List

Algorithm add last(L,e):


newest = Node(e) {create new node instance storing reference to element
e}
newest.next = None {set new node’s next to reference the None object}
L.tail.next = newest {make old tail node point to new node}
L.tail = newest {set variable tail to reference the new node}
L.size = L.size+1 {increment the node count}

Removing First Element from a Singly Linked List


Algorithm remove first(L):


if L.head is None then
Indicate an error: the list is empty.
L.head = L.head.next {make head point to next node (or None)}
L.size = L.size−1 {decrement the node count}

Python Code for Singly Linked List:


class L:
    class Node:
        def __init__(self,data):
            self.data=data
            self.next=None

    def __init__(self):
        self.head=None
        self.tail=None
        self.size=0

    def __len__(self):
        return self.size

    def insert_first(self,data):
        newnode=L.Node(data)
        newnode.next=self.head
        self.head=newnode
        if self.tail==None:
            self.tail=newnode
        self.size+=1

    def insert_last(self,data):
        newnode=L.Node(data)
        if self.tail==None:
            self.head=self.tail=newnode
        else:
            self.tail.next=newnode
            self.tail=newnode
        self.size+=1

    def remove_first(self):
        if self.head==None:
            print("Invalid")
        else:
            self.head=self.head.next
            if self.head==None:   # list became empty
                self.tail=None
            self.size-=1

    def display(self):
        n=self.head
        for i in range(self.size):
            print(n.data)
            n=n.next

    def length(self):
        return self.size


Circularly Linked List:

A circularly linked list, is a collection of nodes that collectively form a linear sequence
and the next of tail node point back to the head of the list.
A circularly linked list provides a more general model than a standard linked list for
data sets that are cyclic, that is, which do not have any particular notion of a beginning and
end.
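The wrap-around can be seen by walking a small hand-built circular list of three nodes; a minimal Node class is assumed here:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

# build A1 -> A2 -> A3 -> back to A1
a1, a2, a3 = Node("A1"), Node("A2"), Node("A3")
a1.next, a2.next, a3.next = a2, a3, a1     # tail's next points back to the head

node, out = a1, []
for _ in range(5):            # walk 5 steps to show the wrap-around
    out.append(node.data)
    node = node.next
print(out)                    # ['A1', 'A2', 'A3', 'A1', 'A2']
```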

Singly circular linked list:

A1 → A2 → A3 → (back to A1)

Doubly circular linked list:

A1 ⇄ A2 ⇄ A3, with A3's next pointing to A1 and A1's prev pointing to A3

Node Creation & Initialization:


class L:
    class Node:
        def __init__(self,data):
            self.data=data
            self.next=None

    def __init__(self):
        self.head=None
        self.tail=None
        self.size=0

Insert at first:

    def insert_first(self,data):
        newnode=L.Node(data)
        newnode.next=self.head
        self.head=newnode
        if self.tail==None:
            self.tail=newnode
        self.tail.next=self.head    # keep the list circular
        self.size+=1


Insert at Last:

    def insert_last(self,data):
        newnode=L.Node(data)
        if self.tail==None:
            self.head=self.tail=newnode
        else:
            self.tail.next=newnode
            self.tail=newnode
        self.tail.next=self.head    # keep the list circular
        self.size+=1

Remove First:
    def remove_first(self):
        if self.head==None:
            print("Invalid")
        elif self.size==1:          # removing the only node empties the list
            self.head=self.tail=None
            self.size-=1
        else:
            self.tail.next=self.head.next
            self.head=self.head.next
            self.size-=1

Displaying all the elements of the List:


    def display(self):
        n=self.head
        for i in range(self.size):
            print(n.data)
            n=n.next

Finding Length of the List:

def length(self):
return self.size


Doubly Linked Lists

A linked list in which each node keeps an explicit reference to the node before it
and a reference to the node after it is known as a doubly linked list.

Doubly Linked List Structure

Advantages:
 Deletion operation is easier.
 Finding the predecessor and successor of node is easier.

Doubly Linked List Node:


Insertion in Doubly Linked List:

Algorithm
1. Create a newnode
2. If there is no list already, make newnode as Head and Tail.
3. else Find the node predata.
4. Update,
   newnode.next = predata.next
   predata.next.prev = newnode
   newnode.prev = predata
   predata.next = newnode
5. Increase the size.


Deletion in Doubly Linked List

Algorithm:

1. Before deletion, check, if list is empty, print (“List is empty”)


2. else using temp variable, find the data to be deleted.
3. If found, Update,
temp.prev.next=temp.next
temp.next.prev=temp.prev
4. Decrease the size

Python code for Doubly Linked List:


class List:
    class Node:
        def __init__(self,data):
            self.data=data
            self.next=None
            self.prev=None

    def __init__(self):
        self.head=None
        self.tail=None
        self.size=0

    def __len__(self):
        return self.size

    def insert(self,predata,data):
        newnode=List.Node(data)
        if(self.head==None):
            self.head=newnode
            self.tail=newnode
            self.size+=1
        else:
            temp=self.head
            while(temp!=None):
                if(temp.data==predata):
                    newnode.next=temp.next
                    if(temp.next!=None):
                        temp.next.prev=newnode
                    else:
                        self.tail=newnode    # inserted after the old tail
                    temp.next=newnode
                    newnode.prev=temp
                    self.size+=1
                    break
                temp=temp.next
            else:
                print("data not found")

    def remove(self,x):
        if(self.head==None):
            print("Empty List")
            return
        elif(self.head.data==x and self.size==1):
            self.head=None
            self.tail=None
            self.size-=1
            print("First node deleted")
            return
        elif(self.head.data==x):
            self.head=self.head.next
            self.head.prev=None
            self.size-=1
            return
        else:
            temp=self.head
            while(temp!=None and temp.data!=x):
                temp=temp.next
            if(temp==None):
                print("Data not found")
                return
            if(temp.next!=None):
                temp.next.prev=temp.prev
            else:
                self.tail=temp.prev          # removed the tail node
            temp.prev.next=temp.next
            self.size-=1
            return

    def display(self):
        if self.head==None:
            print("List empty")
        else:
            n=self.head
            for i in range(self.size):
                print(n.data)
                n=n.next


Stack ADT

A stack is an ordered list in which all insertions and deletions are made at one end,
called the top.

Stack is a list with the restriction that insertions and deletions can be performed in
only one position, namely the end of the list called Top.

It follows LIFO approach. LIFO represents “Last In First Out”. The basic
operations are push and pop.

PushEquivalent to insert.

Pop Equivalent to delete. It deletes the most recently inserted element.

Stack Model:

Push(x) adds element x on top of the stack and Pop removes the topmost element. In
the model, the stack holds 5, 4, 6, 10, 3 from bottom to top, with 3 at the Top (ToS).

The Basic Operations performed in the stack are:


o PUSH()
o POP()
o IsEmpty()
o IsFull()
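Before looking at the full implementations, note that Python's built-in list already behaves as a stack: append acts as push and pop as pop, both working at the same end:

```python
stack = []                     # Python list used as a stack
for x in (5, 4, 6, 10):
    stack.append(x)            # push
top = stack.pop()              # pop removes the most recently inserted element
print(top)                     # 10
print(stack)                   # [5, 4, 6]
```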

Primitive operations on the stack:


 To create a stack
 To insert an element on to the stack.
 To delete an element from the stack.
 To check which element is at the top of the stack.
 To check whether a stack is empty or not.
 To check whether the stack is full or not
 To find the length of the stack


Implementation of Stack:
There are two methods of implementing stack operations.
 Array implementation
 Linked List implementation
Push Operation:
The process of putting a new data element onto stack is known as a Push
Operation.
Push operation involves a series of steps –
Step 1 − Checks if the stack is full.
Step 2 − If the stack is full, produces an error and exit.
Step 3 − If the stack is not full, increments top to point next empty space.
Step 4 − Adds data element to the stack location, where top is pointing.

Code for Push Operation:


def push(self,data):
newnode=Stack.node(data)
if(self.top==None):
self.top=newnode
else:
newnode.next=self.top
self.top=newnode
self.size+=1

Pop Operation:
POP operation is performed on the stack to remove items from the stack
Pop operation involves a series of steps –
Step 1 − Check if the stack is empty (top == -1); if so, report underflow and exit.
Step 2 − Otherwise, access the element top is pointing to: num = stk[top];
Step 3 − Decrease top by 1: top = top - 1;

Code for Pop Operation:


def pop(self):
if(self.isempty()):
print("Stack is empty")
else:
self.top=self.top.next
self.size-=1

isFull():
To check whether the stack is full or not before every push operation.

Code for isFull:


def isFull(self):
return(self.size==MaxSize)

isEmpty():
To check whether the stack is Empty or not before every pop operation.

Code for isEmpty:


def isempty(self):
return(self.size==0)


Linked List implementation of a Stack:


class Stack:
    class node:
        def __init__(self,data):
            self.data=data
            self.next=None

    def __init__(self):
        self.top=None
        self.size=0

    def push(self,data):
        newnode=Stack.node(data)
        if(self.top==None):
            self.top=newnode
        else:
            newnode.next=self.top
            self.top=newnode
        self.size+=1

    def pop(self):
        if(self.isempty()):
            print("Stack is empty")
        else:
            self.top=self.top.next
            self.size-=1

    def isFull(self):
        return(self.size==MaxSize)   # MaxSize: capacity limit for a bounded stack

    def isempty(self):
        return(self.size==0)

    def length(self):
        return(self.size)

    def display(self):
        temp=self.top
        for i in range(self.size):
            print(temp.data)
            temp=temp.next


Queue ADT
Queue is an ordered collection of data items. Items are deleted at the front of the queue
and inserted at the rear of the queue. It has a FIFO structure, i.e. “First In First Out”.
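The FIFO behaviour can be demonstrated with collections.deque from the standard library, which supports efficient insertion at the rear and deletion at the front:

```python
from collections import deque

q = deque()
for x in ("a", "b", "c"):
    q.append(x)            # enqueue at the rear
front = q.popleft()        # dequeue from the front
print(front)               # a  -- first in, first out
print(list(q))             # ['b', 'c']
```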

Queue Model:

Enqueue(Q) → [ rear … QUEUE … front ] → Dequeue(Q)

The basic operations are,

Enqueue – inserts an element at one end of the list, called the rear end.
Dequeue – deletes an element at the other end of the list, called the front end.
Types of Queue:
 Simple Queue
 Circular Queue
 Double Ended Queue
 Priority Queue

Implementation of Simple Queue(Queue):


Like stack, queue can also be implemented in two methods.
 Using Array
 Using Linked list

Queue Operations:
1. Enqueue:
To add an item to the queue. If the queue is full, then it is said to be an Overflow
condition.
Code for Enqueue:
def Enqueue(self,data):
newnode=Queue.Node(data)
if(self.size==0):
self.Front=self.Rear=newnode
else:
self.Rear.next=newnode
self.Rear=newnode
self.size+=1

2. Dequeue:
Removes an item from the queue. The items are removed in the same order in which
they were inserted. If the queue is empty, then it is said to be an Underflow condition.
Code for Dequeue:
def Dequeue(self):
    if(self.size==0):
        print("Queue is Empty")
    else:
        self.Front=self.Front.next
        self.size-=1

3. isFull:
To check whether the queue is full before every Enqueue Operation.
Code for isFull:
def isFull(self):
return(self.size==MaxSize)

Linked List Implementation of a Queue:


class Queue:
    class Node:
        def __init__(self,data):
            self.data=data
            self.next=None

    def __init__(self):
        self.Front=None
        self.Rear=None
        self.size=0

    def Enqueue(self,data):
        newnode=Queue.Node(data)
        if(self.size==0):
            self.Front=self.Rear=newnode
        else:
            self.Rear.next=newnode
            self.Rear=newnode
        self.size+=1

    def Dequeue(self):
        if(self.size==0):
            print("Queue is Empty")
        else:
            self.Front=self.Front.next
            self.size-=1
            if(self.size==0):    # queue became empty
                self.Rear=None

    def isempty(self):
        return(self.size==0)

    def length(self):
        return(self.size)


Double Ended Queue – Deque:

Data structure that supports insertion and deletion at both the front and the back of the
queue is called a double ended queue, or deque.

Deque D supports the following methods:


D.add_first(e) : Add element e to the front of deque D.
D.add_last(e) : Add element e to the back of deque D.
D.delete_first( ): Remove and return the first element from deque D;
an error occurs if the deque is empty.
D.delete_last( ): Remove and return the last element from deque D;
an error occurs if the deque is empty.

Additionally, the deque ADT will include the following accessors:


D.is_empty( ) : Return True if deque D does not contain any elements.
len(D) : Return the number of elements in deque D; in Python,
we implement this with the special method __len__ .
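The deque ADT maps directly onto collections.deque from Python's standard library; the method names differ slightly from the ADT above:

```python
from collections import deque

D = deque()
D.append(5)          # add_last(5)
D.appendleft(3)      # add_first(3)
D.append(7)          # add_last(7)   -> deque now holds 3, 5, 7
first = D.popleft()  # delete_first -> 3
last = D.pop()       # delete_last  -> 7
print(first, last, len(D))   # 3 7 1
```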

Deque Implementation with Doubly Linked List:

class Deque:
    class Node:
        def __init__(self,data):
            self.data=data
            self.prev=None
            self.next=None

    def __init__(self):
        self.Front=None
        self.Rear=None
        self.size=0

    def Enqueue_Front(self,data):
        newnode=Deque.Node(data)
        if(self.isempty()):
            self.Front=self.Rear=newnode
        else:
            newnode.next=self.Front
            self.Front.prev=newnode
            self.Front=newnode
        self.size+=1

    def Enqueue_Rear(self,data):
        newnode=Deque.Node(data)
        if(self.isempty()):
            self.Front=self.Rear=newnode
        else:
            self.Rear.next=newnode
            newnode.prev=self.Rear
            self.Rear=newnode
        self.size+=1

    def Dequeue_Front(self):
        if(self.isempty()):
            print("Deque is Empty")
        else:
            self.Front=self.Front.next
            if(self.Front==None):    # deque became empty
                self.Rear=None
            else:
                self.Front.prev=None
            self.size-=1

    def Dequeue_Rear(self):
        if(self.isempty()):
            print("Deque is Empty")
        else:
            self.Rear=self.Rear.prev
            if(self.Rear==None):     # deque became empty
                self.Front=None
            else:
                self.Rear.next=None
            self.size-=1

    def isempty(self):
        return(self.size==0)

    def display(self):
        temp=self.Front
        for i in range(self.size):
            print(temp.data)
            temp=temp.next


UNIT II – 2 Marks Questions with Answers

1. Define data structure.


The data structure can be defined as the collection of elements and all the possible
operations which are required for those set of elements.
It is a way of organizing data that considers not only the items stored but also their
relationship to each other.
Ex: Array, Linked List, Stack, Queue etc.

2. What do you mean by non-linear data structure? Give example.


The non-linear data structure is the kind of data structure in which the data may be
arranged in hierarchical fashion.
For example- Trees and graphs.

3. What do you mean linear data structure? Give example.


The linear data structure is the kind of data structure in which the data is linearly
arranged.
For example- stacks, queues, linked list.

4. List the various operations that can be performed on data structure.


Various operations that can be performed on the data structure are
• Create
• Insertion of element
• Deletion of element
• Searching for the desired element
• Sorting the elements in the data structure
• Reversing the list of elements.

5. What is abstract data type? What is an ADT not concerned with?
The abstract data type is a triple (D, F, A), where D is a set of domains, F is a set of
functions and A is a set of axioms, in which only what is to be done is mentioned but how it is
to be done is not mentioned. Thus ADT is not concerned with implementation details.

6. List out the areas in which data structures are applied extensively.
Following are the areas in which data structures are applied extensively.
 Operating system - the data structures like priority queues are used for scheduling
the jobs in the operating system.
 Compiler design - the tree data structure is used in parsing the source program.
Stack data structure is used in handling recursive calls.


 Database management system - The file data structure is used in database


management systems. Sorting and searching techniques can be applied on these
data in the file.
 Numerical analysis package - the array is used to perform the numerical analysis
on the given set of data.
 Graphics - the array and the linked list are useful in graphics applications.
 Artificial intelligence - the graph and trees are used for the applications like
building expression trees, game playing.

7. What is a linked list?


A singly linked list, is a collection of nodes that collectively form a linear sequence. It
is a set of nodes where each node has two fields ‘data’ and ‘link’. The data field is used to
store actual piece of information and link field is used to store address of next node.

Singly Linked List Node Representation

Singly Linked List Representation

8. What are the pitfall encountered in singly linked list?


Following are the pitfall encountered in singly linked list
 The singly linked list has only forward pointer and no backward link is provided. Hence
the traversing of the list is possible only in one direction. Backward traversing is not
possible.
 Insertion and deletion operations are less efficient because for inserting the element
at desired position the list needs to be traversed. Similarly, traversing of the list is
required for locating the element which needs to be deleted.

9. Define doubly linked list.


Doubly linked list is a kind of linked list in which each node has two link fields. One link
field stores the address of previous node and the other link field stores the address of the next
node.

Doubly Linked List Structure


10. Write down the steps to modify a node in linked lists.


 Enter the position of the node which is to be modified.
 Enter the new value for the node to be modified.
 Search the corresponding node in the linked list.
 Replace the original value of that node by a new value.
 Display the messages as “The node is modified”.

11. Difference between arrays and lists.


In arrays any element can be accessed randomly with the help of index of array,
whereas in lists any element can be accessed by sequential access only.
Insertion and deletion of data is difficult in arrays on the other hand insertion and
deletion of data is easy in lists.

12. State the properties of LIST abstract data type with suitable example.
Various properties of LIST abstract data type are
 It is linear data structure in which the elements are arranged adjacent to each other.
 It allows to store single variable polynomial.
 If the LIST is implemented using dynamic memory then it is called linked list. Example
of LIST are- stacks, queues, linked list.

13. State the advantages of circular lists over doubly linked list.
In circular list the next pointer of last node points to head node, whereas in doubly
linked list each node has two pointers: one previous pointer and another is next pointer. The
main advantage of circular list over doubly linked list is that with the help of single pointer field
we can access head node quickly. Hence some amount of memory get saved because in
circular list only one pointer is reserved.

14. What are the advantages of doubly linked list over singly linked list?
The doubly linked list has two pointer fields. One field is previous link field and another
is next link field. Because of these two pointer fields we can access any node efficiently
whereas in singly linked list only one pointer field is there which stores forward pointer.

15. Why is the linked list used for polynomial arithmetic?


We can have separate coefficient and exponent fields for representing each term of
polynomial. Hence there is no limit for exponent. We can have any number as an exponent.

16. What is the advantage of linked list over arrays?


The linked list makes use of the dynamic memory allocation. Hence the user can
allocate or de allocate the memory as per his requirements. On the other hand, the array
makes use of the static memory location. Hence there are chances of wastage of the memory
or shortage of memory for allocation.


17. What is the circular linked list?


The circular linked list is a kind of linked list in which the last node is connected to the
first node or head node of the linked list.
Singly circular linked list:

A1 → A2 → A3 → (back to A1)

18. What is the basic purpose of header of the linked list?


The header node is the very first node of the linked list. Sometimes a dummy value
such as -999 is stored in the data field of the header node.

19. What is the advantage of an ADT?


 Change: the implementation of the ADT can be changed without making changes in
the client program that uses the ADT.
 Understandability: ADT specifies what is to be done and does not specify the
implementation details. Hence code becomes easy to understand due to ADT.
 Reusability: the ADT can be reused by some program in future.

20. What is static linked list? State any two applications of it.
 The linked list structure which can be represented using arrays is called static linked
list.
 It is easy to implement, hence for creation of small databases, it is useful.
 The searching of any record is efficient, hence the applications in which the record
need to be searched quickly when the static linked list are used.

21. Define Stack


A Stack is an ordered list in which all insertions (Push operation) and deletion (Pop
operation) are made at one end, called the top. The topmost element is pointed by top. The
top is initialized to -1 when the stack is created that is when the stack is empty.
In a stack S = (a1, ..., an), a1 is the bottom-most element and element ai is on top of
element ai-1. Stack is also referred to as a Last In First Out (LIFO) list.

22. What are the various Operations performed on the Stack?


The various operations that are performed on the stack are:
CREATE(S) – Creates S as an empty stack.
PUSH(S,X) – Adds the element X to the top of the stack.
POP(S) – Deletes the topmost element from the stack.
TOP(S) – Returns the value of the top element of the stack.
ISEMPTY(S) – Returns true if the stack is empty, else false.
ISFULL(S) – Returns true if the stack is full, else false.


23. Explain the usage of stack in recursive algorithm implementation?


In recursive algorithms, stack data structures is used to store the return address when
a recursive call is encountered.
Also it stores the values of all the parameters essential to the current state of the
function.
Recursion makes a program more readable.
In latest enhanced CPU systems, recursion is more efficient than iterations.

24. Define Queue.


A Queue is an ordered list in which all insertions take place at one end called the rear,
while all deletions take place at the other end called the front. Rear is initialized to -1 and front
is initialized to 0. Queue is also referred as First In First Out (FIFO) list.

25. What are the various operations performed on the Queue?


The various operations performed on the queue are:
CREATE(Q) – Creates Q as an empty queue.
Enqueue(Q,X) – Adds the element X to the queue.
Dequeue(Q) – Deletes an element from the queue.
ISEMPTY(Q) – Returns true if the queue is empty, else false.
ISFULL(Q) – Returns true if the queue is full, else false.

26. How do you test for an empty Queue?


The condition for testing an empty queue is rear == front - 1. In the linked list
implementation of a queue, the condition for an empty queue is that the header node's link
field is NULL.
bool isEmpty(Queue Q)
{
    if (Q->Rear == -1)
        return true;
    return false;
}

27. Define Deque.


Deque stands for Double ended queue. It is a linear list in which insertions and deletion
are made from either end of the queue structure.

28. Write down the function to insert an element into a queue, in which the queue is
implemented as an array.
void enqueue (int X, Queue Q)
{
if(IsFull(Q))
Error (“Full queue”);
else
{
Q->Size++;
Q->Rear = Q->Rear+1;
Q->Array[ Q->Rear ]=X;
}
}


29. Define Circular Queue.


Another representation of a queue, which prevents excessive use of memory by
arranging elements/nodes Q1, Q2, …, Qn in a circular fashion. That is, it is the queue which
wraps around upon reaching the end of the queue.

30. List any four applications of stack.


 Parsing context free languages
 Evaluating arithmetic expressions
 Function call
 Traversing trees and graph
 Tower of Hanoi


UNIT – III SORTING AND SEARCHING

Bubble sort – selection sort – insertion sort – merge sort – quick sort – linear
search – binary search – hashing – hash functions – collision handling – load factors,
rehashing, and efficiency.

Sorting:

Sorting is used to arrange the data (collection of items) in the array/list in ascending or
descending order. Given a collection, the goal is to rearrange the elements so that they are
ordered from Smallest to Largest. Or Largest to Smallest.

1) Bubble Sort:

There is a simple, but inefficient algorithm, called bubble-sort, for sorting a list L of n
comparable elements. This algorithm scans the list n−1 times, where, in each scan, the
algorithm compares the current element with the next one and swaps them if they are out of
order.

This algorithm uses multiple passes and in each pass the first and second data items
are compared.

 If the first data item is bigger than the second, then the two items are swapped.
 Next the items in second and third position are compared and if the first one is
larger than the second, then they are swapped, otherwise no change in their order.
 This process continues for each successive pair of data items until all items are
sorted.

Algorithm:

#Bubble Sort
def BubbleSort(A):
    for i in range(len(A)):
        for j in range(len(A)-1):
            if(A[j]>A[j+1]):
                A[j],A[j+1]=A[j+1],A[j]
    print(A)

BubbleSort([4,2,7,3,1,8,9])

Output:
[1, 2, 3, 4, 7, 8, 9]


Step-by-step example:
Let us take the array of numbers "6 2 5 3 9", and sort the array from lowest number to
greatest number using bubble sort.
In each step, elements written in bold are being compared. Three passes will be
required.
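The passes can be traced with a small variant of the bubble sort code that records the list after each pass and stops early when a pass makes no swaps; the early-exit flag is an addition to the basic algorithm above:

```python
def bubble_sort_traced(A):
    """Bubble sort that records the list after each pass and stops early."""
    passes = []
    for i in range(len(A) - 1):
        swapped = False
        for j in range(len(A) - 1 - i):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swapped = True
        passes.append(list(A))
        print("after pass", len(passes), ":", A)
        if not swapped:        # no swaps: the list is already sorted
            break
    return passes

passes = bubble_sort_traced([6, 2, 5, 3, 9])
print(len(passes), "passes")   # 3 passes: the third pass confirms the order
```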

Time Complexity:
The efficiency of Bubble sort algorithm is independent of number of data items in the
array and its initial arrangement. If an array containing n data items, then the outer loop
executes n-1 times as the algorithm requires n-1 passes.
In the first pass, the inner loop is executed n-1 times; in the second pass, n-2 times; in
the third pass, n-3 times and so on. The total number of iterations resulting in a run time of
O(n2).
Worst Case Performance O(n2)
Best Case Performance O(n2)
Average Case Performance O(n2)
The total no. of iterations for the inner loop will be the sum of the first n - 1 integers,
which Equals resulting in a run time of O(n2).
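The version above always performs the full set of passes. A common refinement (a sketch, not part of the algorithm given above) keeps a flag and stops as soon as a pass makes no swap, which brings the best case down to O(n) for input that is already sorted:

```python
# Bubble sort with an early-exit flag: if a full pass makes no swap,
# the list is already sorted and we can stop.
def bubble_sort_optimized(A):
    n = len(A)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):        # the last i items are already in place
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swapped = True
        if not swapped:                   # no swaps in this pass -> done
            break
    return A

print(bubble_sort_optimized([4, 2, 7, 3, 1, 8, 9]))   # [1, 2, 3, 4, 7, 8, 9]
```

On sorted input the flag stays False after the first pass, so only one pass of n-1 comparisons is made.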


2) Selection Sort
Selection sort is one of the simplest sorting algorithms. It sorts the elements in an
array by finding the minimum element of the unsorted part in each pass and moving it to the
beginning.
This sorting technique improves over bubble sort by making only one exchange in each
pass. It maintains two subarrays: one that is already sorted and one that is unsorted. In each
iteration the minimum element (for ascending order) is picked from the unsorted subarray
and moved to the sorted subarray.

Python Code
# Selection Sort
def SelectionSort(A):
    for i in range(len(A)):
        min_index=i
        for j in range(i+1,len(A)):
            if(A[min_index]>A[j]):
                min_index=j
        A[i],A[min_index]=A[min_index],A[i]
    print(A)

SelectionSort([3,20,1,4,5,2])

Step-by-step example:

Here is an example of this sort algorithm sorting five elements:

Time Complexity:
Selection sort is not difficult to analyse compared to other sorting algorithms since none
of the loops depend on the data in the array. Selecting the lowest element requires scanning
all n elements (this takes n − 1 comparisons) and then swapping it into the first position.
Finding the next lowest element requires scanning the remaining n − 1 elements and so on,
for (n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1) / 2 ∈ O(n2) comparisons. Each of these scans
requires one swap for n − 1 elements (the final element is already in place).
Worst Case Performance O(n2)
Best Case Performance O(n2)
Average Case Performance O(n2)


3) Insertion Sort:
We start with the first element in the array. One element by itself is already sorted.
Then we consider the next element in the array. If it is smaller than the first, we swap them.
Next we consider the third element in the array. We swap it leftward until it is in its
proper order with the first two elements. We then consider the fourth element, and swap it
leftward until it is in the proper order with the first three.
We continue in this manner with the fifth element, the sixth, and so on, until the whole
array is sorted.
Algorithm InsertionSort(A):
Input: An array A of n comparable elements
Output: The array A with elements rearranged in nondecreasing order
for k from 1 to n − 1 do
Insert A[k] at its proper location within A[0], A[1], ..., A[k].
Step-by-step example:

Algorithm:
def InsertionSort(A):
    for j in range(1,len(A)):
        i=j
        while(i>0 and A[i]<A[i-1]):
            A[i],A[i-1]=A[i-1],A[i]
            i-=1
    print(A)

InsertionSort([5,14,30,2,1])


Time Complexity:
Worst Case Performance O(n2)
Best Case Performance(nearly) O(n)
Average Case Performance O(n2)

4) Merge Sort:

Merge sort is based on the divide and conquer method. It takes the list to be sorted
and divides it in half to create two unsorted lists. The two unsorted lists are then sorted and
merged to get a sorted list. The two halves are sorted by calling the merge sort procedure
recursively; we eventually get lists of size 1, which are already sorted. The lists of size 1 are
then merged.

This is a divide and conquer algorithm. This works as follows:

1. Divide the input which we have to sort into two parts in the middle. Call it the left
part and right part.
2. Sort each of them separately. Note that here sort does not mean to sort it using
some other method. We use the same function recursively.
3. Then merge the two sorted parts.

Step-by-step Example:

Algorithm:
def merge(S1, S2, S):
    i=j=0
    while i + j < len(S):
        if j == len(S2) or (i < len(S1) and S1[i] < S2[j]):
            S[i+j] = S1[i]
            i += 1
        else:
            S[i+j] = S2[j]
            j += 1


def Partition(S):
    n = len(S)
    if n < 2:
        return            # a list of 0 or 1 element is already sorted
    mid = n // 2
    S1 = S[0:mid]         # copy of first half
    S2 = S[mid:n]         # copy of second half
    Partition(S1)         # sort copy of first half
    Partition(S2)         # sort copy of second half
    merge(S1, S2, S)      # merge the sorted halves back into S

A=[85,24,63,450,170,31,96,50]
Partition(A)
print("Sorted Array is")
for i in range(len(A)):
    print(A[i])

The Running Time of Merge-Sort:


We begin by analyzing the running time of the merge algorithm. Let n1 and n2 be the
number of elements of S1 and S2, respectively. It is clear that the operations performed inside
each pass of the while loop take O(1) time. The key observation is that during each iteration
of the loop, one element is copied from either S1 or S2 into S. Therefore, the number of
iterations of the loop is n1+n2. Thus, the running time of algorithm merge is O(n1+n2). Since
the recursion halves the list at each of O(log n) levels, and the merging at each level takes
O(n) time in total, the overall running time of merge sort is O(n log n).

5) Quick Sort:
The quick sort algorithm also uses the divide and conquer strategy. But unlike the
merge sort, which splits the sequence of keys at the midpoint, the quick sort partitions the
sequence by dividing it into two segments based on a selected pivot key. In addition, the quick
sort can be implemented to work with virtual sub sequences without the need for temporary
storage.
Quick sort is a divide and conquer algorithm. Quick sort first divides a large list into
two smaller sublists: the low elements and the high elements. Quick sort can then recursively
sort the sub-lists.

The steps are:


1. Pick an element, called a pivot, from the list.
2. Reorder the list so that all elements with values less than the pivot come before the
pivot, while all elements with values greater than the pivot come after it. (equal values
can go either way).
3. After this partitioning, the pivot is in its final position. This is called the partition
operation.
4. Recursively apply the above steps to the sub-list of elements with smaller values and
separately the sub-list of elements with greater values.


Step-by-step Example 1:

Initial array: 15 6 8 4 11 9 2 1 5

Pivot = last element (5). Left starts at position 0; right starts at the position just
before the pivot. While left <= right: if A[left] < pivot, move left to the next position;
if A[right] > pivot, move right to the previous position; if left <= right, swap A[left]
and A[right] and advance both. When left and right cross, swap A[left] with the pivot
and lock the pivot in its final place.

15 6 8 4 11 9 2 1 5    swap 15 and 1         ->  1 6 8 4 11 9 2 15 5
1 6 8 4 11 9 2 15 5    swap 6 and 2          ->  1 2 8 4 11 9 6 15 5
1 2 8 4 11 9 6 15 5    swap 8 and 4          ->  1 2 4 8 11 9 6 15 5
left and right cross   swap 8 with pivot 5   ->  1 2 4 5 11 9 6 15 8   (5 locked)

Quicksort is now applied in the same way to the left part (1 2 4) and the right part
(11 9 6 15 8) on either side of the locked pivot:

1 2 4          pivot 4: left runs past right, 4 is already in place   (4 locked)
11 9 6 15 8    swap 11 and 6 -> 6 9 11 15 8; left and right cross,
               swap 9 with pivot 8 -> 6 8 11 15 9                     (6, 8 locked)
11 15 9        left and right cross, swap 11 with pivot 9 -> 9 15 11  (9 locked)
15 11          left and right cross, swap 15 with pivot 11 -> 11 15   (11 locked)

All remaining single elements are locked, giving the sorted array:

1 2 4 5 6 8 9 11 15



Algorithm:
def Quicksort(S, l, r):
    if l >= r:
        return                        # 0 or 1 element: already sorted
    pivot = S[r]                      # last element is the pivot
    left = l
    right = r-1
    while left <= right:
        while left <= right and S[left] < pivot:
            left += 1
        while left <= right and pivot < S[right]:
            right -= 1
        if left <= right:
            S[left], S[right] = S[right], S[left]
            left = left + 1
            right = right - 1
    S[left], S[r] = S[r], S[left]     # put the pivot into its final place
    Quicksort(S, l, left - 1)
    Quicksort(S, left + 1, r)

A=[41,21,5,10,6,3]
Quicksort(A,0,5)
print(A)


Searching:

Searching is the process of selecting particular information from a collection of


data based on specific criteria. Example: Performing web searches to locate pages containing
certain words or phrases or when looking up a phone number in the telephone book.

1) Linear Search:
The algorithm uses the guess and check pattern by first guessing that the smallest
item is the first item in the list and then checking the subsequent items to see if it made an
incorrect guess.

When the sequence is unsorted, the standard approach to search for a target
value is to use a loop to examine every element, until either finding the target or exhausting
the data set. This is known as the Linear or sequential search algorithm. This algorithm runs
in O(n) time (i.e., linear time) since every element is inspected in the worst case.
Example:
List : 10,51,2,18,4,31,13,5,23,64,29 Element to be searched : 31

Algorithm:

# Linear Search
def LinearSearch(List,data):
    for i in range(len(List)):
        if(List[i]==data):
            print(data,"present at position",i+1)
            break
    else:
        print("Element not found!")

List=[10,51,2,18,4,31,13,5,23,64,29]
LinearSearch(List,31)

Output:
31 present at position 6


2) Binary Search:

Binary Search is a classic recursive algorithm used to efficiently locate a target


value within a sorted sequence of n elements. This is among the most important of computer
algorithms, and it is the reason that we so often store data in sorted order.

Values are stored in sorted order within an indexable sequence, such as a Python list.

The numbers at top are the indices.

Example of Binary Search for the target value 22.

When the sequence is sorted and indexable, there is a much more efficient algorithm.
Initially, low = 0 and high = n−1. We then compare the target value to the median candidate,
that is, the item data[mid] with index mid = (low + high)//2.

We consider three cases:


 If the target equals data[mid], then we have found the item we are looking for, and the
search terminates successfully.
 If target < data[mid], then we recur on the first half of the sequence, that is, on the
interval of indices from low to mid−1.
 If target > data[mid], then we recur on the second half of the sequence, that is, on the
interval of indices from mid+1 to high.
 An unsuccessful search occurs if low > high, as the interval [low,high] is empty.

This algorithm is known as Binary Search.


Algorithm For Binary Search

def BinarySearch(List,data,low,high):
    if(low<=high):
        mid=(low+high)//2
        if(data==List[mid]):
            print(data,"present at position",mid+1)
        elif(data<List[mid]):
            BinarySearch(List,data,low,mid-1)
        else:
            BinarySearch(List,data,mid+1,high)
    else:
        print("Element not found!")

List=[1,2,3,4,5,6,7,8,9,10]
BinarySearch(List,1,0,9)

Output:
1 present at position 1
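The same search can also be written iteratively, avoiding the recursive calls. This sketch is an illustrative variant that returns the index of the target (or -1 when absent) instead of printing:

```python
# Iterative binary search on a sorted list: returns the index of data,
# or -1 if it is not present.
def binary_search_iter(List, data):
    low, high = 0, len(List) - 1
    while low <= high:
        mid = (low + high) // 2
        if List[mid] == data:
            return mid
        elif data < List[mid]:
            high = mid - 1        # continue in the left half
        else:
            low = mid + 1         # continue in the right half
    return -1

print(binary_search_iter([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 7))   # 6
```

Each iteration halves the interval [low, high], so at most O(log n) comparisons are made.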


Hashing:
What is Hashing?
Hashing in the data structure is a technique of mapping a large chunk of data into small
tables using a hashing function. It is also known as the message digest function. It is a
technique that uniquely identifies a specific item from a collection of similar items. It uses hash
tables to store the data in an array format.
Each value in the array is assigned a unique index number. Hash tables use a
technique to generate these unique index numbers for each value stored in an array format.
This technique is called the hash technique.

Hash table:

The hash table data structure is merely an array of some fixed size, containing the
keys. A key is a string with an associated value.

Each key is mapped into some number in the range 0 to tablesize-1 and placed in the
appropriate cell. In the following example, tablesize is 5 ie., 0 to 4.

21 % 5 = 1 -> slot 1 holds 21
32 % 5 = 2 -> slot 2 holds 32
18 % 5 = 3 -> slot 3 holds 18

Hash function:

A hash function is a key to address transformation which acts upon a given key to
compute the relative position of the key in an array.

The choice of hash function should be simple and it must distribute the data evenly.

def Hash(key, tablesize):
    return key % tablesize

Importance of hashing:

 Maps key with the corresponding value using hash function.


 Hash tables support the efficient addition of new entries and the time spent on
searching for the required data is independent of the number of items stored.
 A hash function is any function that can be used to map data of arbitrary size to data
of fixed size.
 A perfect hash function has no blanks and no collisions.


1. Division method: The hash function depends upon the remainder of division.

H(key) = record % table size


For example, insert 12, 16, 30 into a table of size 11:

12 % 11 = 1 -> slot 1
16 % 11 = 5 -> slot 5
30 % 11 = 8 -> slot 8

2. Mid square: In the mid square method, the key is squared and the middle or mid part of
the result is used as the index.

Consider that we want to place a record 3111 in a hash table of size 1000:
3111^2 = 9678321
H(3111) = 783 (the middle 3 digits)

3. Digital Folding: The Key is divided into separate part and using some simple operation
these parts are combined to produce the hash key.

For example, consider a record 12365412


H(key) = 123 + 654 +12
= 789
The record will be placed at location 789 in the hash table.
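The three hash function methods above can be sketched in Python. The function names are illustrative, and the digit groupings mirror the worked examples (3 middle digits for mid square, groups of 3 digits for folding):

```python
# Division method: remainder of division by the table size.
def hash_division(key, tablesize):
    return key % tablesize

# Mid-square method: square the key and take the middle digits of the result.
def hash_midsquare(key, digits=3):
    s = str(key * key)
    start = (len(s) - digits) // 2
    return int(s[start:start + digits])

# Folding method: split the key into groups of 3 digits and add them.
def hash_folding(key):
    s = str(key)
    parts = [s[i:i + 3] for i in range(0, len(s), 3)]
    return sum(int(p) for p in parts)

print(hash_division(30, 11))    # 8
print(hash_midsquare(3111))     # 783 (3111^2 = 9678321)
print(hash_folding(12365412))   # 789 (123 + 654 + 12)
```

The outputs match the worked examples given for each method above.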

Collision:

A collision occurs when a newly inserted element hashes to the same value as an
already inserted element.

The collision resolution techniques are:

 Separate chaining or External hashing.
 Open addressing or Closed hashing.

Separate Chaining:

Separate chaining is a collision resolution technique to keep the list of all elements that
hash to the same value. This is called separate chaining because each hash table element
is a separate chain (linked list). Each linked list contains all the elements whose keys hash
to the same index.

More number of elements can be inserted as it uses linked lists. For ex, insert
18,54,28,25,41,38,36,12,90.
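A minimal separate-chaining table can be sketched as follows, with Python lists standing in for the linked-list chains (the class and method names are illustrative; the keys are the ones from the example above, with an assumed table size of 10):

```python
# Separate chaining: each slot of the table holds a chain of all keys
# that hash to that index (Python lists stand in for linked lists).
class ChainedHashTable:
    def __init__(self, size):
        self.size = size
        self.table = [[] for _ in range(size)]

    def insert(self, key):
        self.table[key % self.size].append(key)   # append to the slot's chain

    def search(self, key):
        return key in self.table[key % self.size] # scan only one chain

t = ChainedHashTable(10)
for k in (18, 54, 28, 25, 41, 38, 36, 12, 90):
    t.insert(k)
print(t.table[8])    # [18, 28, 38] -- all three hash to slot 8
print(t.search(90))  # True
```

Note that 18, 28, and 38 all give remainder 8 modulo 10, so they share one chain; searching inspects only that chain, not the whole table.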


In the worst case, operations on an individual bucket take time proportional to the size
of the bucket. Assuming we use a good hash function to index the n items of our map in a
bucket array of capacity N, the expected size of a bucket is n/N.

Therefore, if given a good hash function, the core map operations run in O(n/N). The
ratio λ = n/N, called the load factor of the hash table, should be bounded by a small constant,
preferably below 1. As long as λ is O(1), the core operations on the hash table run in O(1)
expected time.

Advantages of separate chaining:


1. Simple to implement.
2. Hash table never fills up.
3. Less sensitive to the hash function or load factors.
4. It is mostly used when it is unknown how many and how frequently keys may be
inserted or deleted.
Disadvantages of separate chaining:
1. Cache performance of chaining is not good.
2. Wastage of Space.
3. If the chain becomes long, then search time can become O(n) in worst case.
4. Uses extra space for links.

Performance Evaluation of Separate Chaining:


m = Number of slots in hash table.
n = Number of keys to be inserted in hash table.
Load factor α = n/m.
Expected time to search = O (1 + α).
Expected time to insert/delete = O (1 + α).
Time complexity of search insert and delete is O (1) if Load Factor (α) is O (1).


Open addressing:

Open addressing is a collision resolving strategy in which, if a collision occurs,
alternative cells are tried until an empty cell is found. The cells h0(x), h1(x), h2(x), ... are
tried in succession, where,

hi(x)=(Hash(x)+F(i))mod Tablesize with F(0)=0.

The function F is the collision resolution strategy.

Open addressing requires that the load factor is always at most 1 and that items are
stored directly in the cells of the bucket array itself.

Collision Resolution Strategy In Open Addressing:

(i) Linear probing - With this approach, if we try to insert an item (k,v) into a bucket A[ j] that
is already occupied, where j = h(k), then we next try A[(j +1) mod N]. If A[(j +1) mod N] is also
occupied, then we try A[(j + 2) mod N], and so on, until we find an empty bucket that can
accept the new item. Once this bucket is located, we simply insert the item there.

Example:
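As an illustration, the linear-probing insertion described above can be sketched as follows (a simplified example that stores bare keys, uses None for an empty bucket, and assumes the table never fills):

```python
# Linear probing: on a collision at slot j, try (j+1) mod N, (j+2) mod N, ...
# until an empty bucket is found. Assumes the table is not full.
def lp_insert(table, key):
    N = len(table)
    j = key % N
    while table[j] is not None:   # bucket occupied: probe the next one
        j = (j + 1) % N
    table[j] = key
    return j                      # slot where the key was placed

table = [None] * 11
for k in (37, 90, 55, 22, 17, 49, 87):
    lp_insert(table, k)
print(table)
```

Here 55 and 22 both hash to slot 0 (mod 11); 55 arrives first, so 22 is probed forward and lands in slot 1.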

(ii) Quadratic probing - Another open addressing strategy, known as quadratic probing,
iteratively tries the buckets A[(h(k) + f(i)) mod N], for i = 0,1,2,..., where f(i) = i^2, until finding
an empty bucket. As with linear probing, the quadratic probing strategy complicates the
removal operation, but it does avoid the kinds of clustering patterns that occur with linear
probing.

H = (Hash(key) + i^2) mod m, where m is the table size or any prime number.

Example: Insert the following elements into a hash table of size 11:

37, 90, 55, 22, 17, 49, 87.

37 % 11 = 4  -> slot 4
90 % 11 = 2  -> slot 2
55 % 11 = 0  -> slot 0
22 % 11 = 0  -> collision; (22 + 1^2) % 11 = 1 -> slot 1
17 % 11 = 6  -> slot 6
49 % 11 = 5  -> slot 5
87 % 11 = 10 -> slot 10

slot:  0   1   2   3   4   5   6   7   8   9   10
       55  22  90      37  49  17              87


(iii) Double hashing - in which F(i) = i·hash2(X). This formula says that we apply a second
hash function to X and probe at a distance hash2(X), 2hash2(X), ..., and so on.

In this approach, we choose a secondary hash function, h′, and if the primary hash function
maps some key k to a bucket A[h(k)] that is already occupied, then we iteratively try the
buckets A[(h(k) + f(i)) mod N] next, for i = 1,2,3,..., where f(i) = i · h′(k). In this scheme, the
secondary hash function is not allowed to evaluate to zero; a common choice is
h′(k) = q − (k mod q), for some prime number q < N. Also, N should be a prime.

A function such as hash2(X)=R-(X mod R), with R a prime smaller than Tablesize.

Example:

Insert 37, 90, 55, 22, 14 into a hash table of size 11 using the double hashing method,
with the second hash function hash2(X) = 7 − (X mod 7) (here the prime chosen is R = 7).

37 % 11 = 4 -> slot 4
90 % 11 = 2 -> slot 2
55 % 11 = 0 -> slot 0
22 % 11 = 0 -> collision; hash2(22) = 7 - (22 % 7) = 6, so try (0 + 6) % 11 = 6 -> slot 6
14 % 11 = 3 -> slot 3

slot:  0   1   2   3   4   5   6   7   8   9   10
       55      90  14  37      22

Comparison of above three:

Linear probing has the best cache performance, but suffers from clustering. One
more advantage of Linear probing is easy to compute.

Quadratic probing lies between the two in terms of cache performance and
clustering.

Double hashing has poor cache performance but no clustering. Double hashing
requires more computation time as two hash functions need to be computed.


2 Mark Questions with Answers


1. Define Sorting.

Sorting is a method of arranging data items in ascending or descending order.


The various methods of sorting are,
 Bubble Sort
 Insertion Sort
 Selection Sort
 Merge Sort
 Quick Sort

2. What is searching?

It is a process of locating an element stored in a file or array.


Different searching methods are,
a. Linear Search(or Sequential Search)
b. Binary Search
Advantage of linear search method:
It is simple and useful when the elements to be searched are not in any definite
order.

3. What is stability in sorting algorithms?


A sorting algorithm is said to be stable if two objects with equal keys appear in the
same order in sorted output as they appear in the input unsorted array. Some sorting
algorithms are stable by nature like Insertion sort, Merge Sort, Bubble Sort, etc. And some
sorting algorithms are not, like Heap Sort, Quick Sort, etc

3. Specify the time complexity of different sorting algorithms.

Algorithm Time Complexity


Best Average Worst
Selection Sort Ω(n^2) θ(n^2) O(n^2)
Bubble Sort Ω(n) θ(n^2) O(n^2)
Insertion Sort Ω(n) θ(n^2) O(n^2)
Heap Sort Ω(n log(n)) θ(n log(n)) O(n log(n))
Quick Sort Ω(n log(n)) θ(n log(n)) O(n^2)
Merge Sort Ω(n log(n)) θ(n log(n)) O(n log(n))
Bucket Sort Ω(n+k) θ(n+k) O(n^2)
Radix Sort Ω(nk) θ(nk) O(nk)


4. Specify the space complexity of different sorting algorithms.

5. List the sorting algorithms which use logarithmic time complexity.

Algorithm Time Complexity

Best Average Worst


Heap Sort Ω(n log(n)) θ(n log(n)) O(n log(n))

Quick Sort Ω(n log(n)) θ(n log(n)) O(n^2)

Merge Sort Ω(n log(n)) θ(n log(n)) O(n log(n))

6. What is the time complexity of linear and binary search?


Linear Search (sorted/unsorted)
 Worst-case performance - O(n)
 Best-case performance - O(1)
 Average performance - O(n)
 Worst-case space complexity - O(1) iterative
Binary Search (sorted only)
 Worst-case performance - O(log n)
 Best-case performance - O(1)
 Average performance - O(log n)
 Worst-case space complexity - O(1)

7. What is in-place sorting?


An in-place sorting algorithm uses constant extra space even for producing the output
ie., modifies the given array only. For example, Insertion Sort and Selection Sorts are in-place
sorting algorithms and a typical implementation of Merge Sort is not in-place.

8. What are Internal and External Sorting?


When all data that needs to be sorted cannot be placed in-memory at a time, the
sorting is called external sorting. External Sorting is used for massive amount of data. Merge
Sort and its variations are typically used for external sorting. Some external storage like hard-
disk, CD, etc is used for external storage. When all data is placed in-memory, then sorting is
called internal sorting.


9. Define Hashing.

Hashing is the process of mapping a large amount of data items to a smaller table with
the help of a hashing function. The modulo operator is used to get the key value from the
actual data/information.

For ex, Insert 12,16,34.


12 % 10 = 2 -> slot 2
16 % 10 = 6 -> slot 6
34 % 10 = 4 -> slot 4

10. What do you mean by hash table?

The hash table data structure is merely an array of some fixed size, containing the
keys. A key is a string with an associated value.

Each key is mapped into some number in the range 0 to tablesize-1 and placed in the
appropriate cell. In the following example, tablesize is 5 ie., 0 to 4.

21 % 5 = 1 -> slot 1 holds 21
32 % 5 = 2 -> slot 2 holds 32
18 % 5 = 3 -> slot 3 holds 18

11. What do you mean by hash function?

A hash function is a key to address transformation which acts upon a given key to
compute the relative position of the key in an array.

The choice of hash function should be simple and it must distribute the data evenly.

def Hash(key, tablesize):
    return key % tablesize


12. Write the importance of hashing.

 Maps key with the corresponding value using hash function.


 Hash tables support the efficient addition of new entries and the time spent on
searching for the required data is independent of the number of items stored.
 A hash function is any function that can be used to map data of arbitrary size to data
of fixed size.
 A perfect hash function has no blanks and no collisions.

13. What do you mean by collision in hashing? Name some collision resolution
techniques.

When an element is inserted, it hashes to the same value as an already inserted


element, and then it produces collision.
 Separate chaining or External hashing.
 Open addressing or Closed hashing

14. What do you mean by separate chaining?

Separate chaining is a collision resolution technique to keep the list of all elements
that hash to the same value. This is called separate chaining because each hash table
element is a separate chain (linked list). Each linked list contains all the elements whose keys
hash to the same index.

More number of elements can be inserted as it uses linked lists. For ex, insert
12,17,22,24.

12 % 5 = 2
17 % 5 = 2
22 % 5 = 2
24 % 5 = 4

slot 2 -> 12 -> 17 -> 22 (all three keys chain at index 2)
slot 4 -> 24

15. Give the Performance Evaluation of Separate Chaining.

m = Number of slots in hash table


n = Number of keys to be inserted in hash table
Load factor α = n/m
Expected time to search = O(1 + α)
Expected time to insert/delete = O(1 + α)
Time complexity of search insert and delete is O(1) if α is O(1)


16. List some advantages and disadvantages of separate chaining?


Advantage:
1. Simple to implement.
2. Hash table never fills up, we can always add more elements to chain.
3. Less sensitive to the hash function or load factors.
4. It is mostly used when it is unknown how many and how frequently keys may be
inserted or deleted.
Disadvantages of separate chaining.
1. Cache performance of chaining is not good as keys are stored using linked list. Open
addressing provides better cache performance as everything is stored in same table.
2. Wastage of Space (Some Parts of hash table are never used)
3. If the chain becomes long, then search time can become O(n) in worst case.
4. Uses extra space for links.

17. What do you mean by open addressing?

Open addressing is a collision resolving strategy in which, if collision occurs


alternative cells are tried until an empty cell is found. The cells h0(x), h1(x), h2(x),…. are tried
in succession, where,

hi(x)=(Hash(x)+F(i))mod Tablesize with F(0)=0.


The function F is the collision resolution strategy.
Let us consider a simple hash function as “key mod 7” and sequence of keys as 50,
700, 76, 85, 92, 73, 101.

18. What do you mean by primary clustering?

In the linear probing collision resolution strategy, blocks of occupied cells start forming
even if the table is relatively empty. This effect, known as primary clustering, means that any
key that hashes into the cluster will require several attempts to resolve the collision, and the
key will then add to the cluster.


19. What are the types of collision resolution strategies in open addressing?

Linear probing - In which F is a linear function of i, F(i)=i. This amounts to trying


sequentially in search of an empty cell. If the table is big enough, a free cell can always be
found, but the time to do so can get quite large.

Quadratic probing - If a collision occurs, alternative cells are tried until an empty cell
is found. The hash table is represented as a one-dimensional array with indices that range
from 0 to the desired table size - 1.

In quadratic probing the alternative cells are calculated using the formula, F(i) = i^2.

H = (Hash(key) + i^2) mod m, where m is the table size or any prime number.

Double hashing - in which F(i) = i·hash2(X). This formula says that we apply a second
hash function to X and probe at a distance hash2(X), 2hash2(X), ..., and so on. A function
such as hash2(X) = R - (X mod R), with R a prime smaller than Tablesize, works well.

Comparison of above three:

 Linear probing has the best cache performance, but suffers from clustering. One more
advantage of Linear probing is easy to compute.
 Quadratic probing lies between the two in terms of cache performance and clustering.
 Double hashing has poor cache performance but no clustering. Double hashing
requires more computation time as two hash functions need to be computed.

20. What do you mean by secondary clustering?

Although quadratic probing eliminates primary clustering, elements that hash to the
same position will probe the same alternative cells. This is known as secondary clustering.

21. What do you mean by rehashing?

Rehashing means building another table that is about twice as big, with an associated
new hash function, then scanning down the entire original hash table, computing the new
hash value for each element and inserting it in the new table.

Advantage:

 Table size is not a problem.


 Hash tables cannot be made arbitrarily large.
 Rehashing can be used in other data structures.
Disadvantage:

 It is a very expensive operation.


 The running time is O(N).
 Slowing down of rehashing method.


22. What is the need for extendible hashing?

If either open addressing hashing or separate chaining hashing is used, the major
problem is that collisions could cause several blocks to be examined during a Find, even for
a well-distributed hash table. Extendible hashing allows a find to be performed in two disk
accesses. Insertions also require few disk accesses.

23. List the limitations of linear probing.

Linear probing - In which F is a linear function of i, F(i)=i. This amounts to trying


sequentially in search of an empty cell. If the table is big enough, a free cell can always be
found, but the time to do so can get quite large.

Limitations of linear probing:

• Time taken for finding the next available cell is large.


• In linear probing, we come across a problem known as clustering.

24. Mention one advantage and disadvantage of using quadratic probing.

Advantage: The problem of primary clustering is eliminated.

Disadvantage: There is no guarantee of finding an unoccupied cell once the table is


nearly half full.

25. Give some advantages of Open Addressing and Separate Chaining.


Advantages of Chaining:

1) Chaining is Simpler to implement.


2) In chaining, Hash table never fills up, we can always add more elements to chain.
In open addressing, table may become full.
3) Chaining is Less sensitive to the hash function or load factors.
4) Chaining is mostly used when it is unknown how many and how frequently keys
may be inserted or deleted.
5) Open addressing requires extra care for to avoid clustering and load factor.

Advantages of Open Addressing:


1) Cache performance of chaining is not good as keys are stored using linked list.
Open addressing provides better cache performance as everything is stored in same
table.
2) Wastage of Space (Some Parts of hash table in chaining are never used). In Open
addressing, a slot can be used even if an input doesn’t map to it.
3) Chaining uses extra space for links.


26. List out the applications of hashing.

1. DBMS. 2. Computer networks

3. Storage of secret data 4. Cryptography.

5. Securing the database 6. Database applications

7. Storing data in a database

27. What is linear searching? What is the time complexity?

 A linear search scans one item at a time, without jumping to any item.
 The worst case complexity is O(n), sometimes known as an O(n) search.
 Time taken to search elements keep increasing as the number of elements are
increased.

28. What is Binary search?

Binary Search is a searching algorithm for finding an element's position in a sorted


array by repeatedly dividing the search interval in half.
Implementation
• Iterative Method
• Recursive Method

29. What are the Applications of Binary search?


 In libraries of Java, .Net, C++ STL
 While debugging, the binary search is used to pinpoint the place where the error
happens.

30. Why Hashing is needed?

 After storing a large amount of data. Linear search and binary search perform
lookups/search with time complexity of O(n) and O(log n) respectively.
 As the size of the dataset increases, these complexities also become significantly
high which is not acceptable.
 We need a technique that does not depend on the size of data. Hashing allows
lookups to occur in constant time i.e. O(1).


UNIT IV TREE STRUCTURES


Tree ADT – Binary Tree ADT – tree traversals – binary search trees – AVL trees
– heaps – multiway search trees.

Tree ADT:
Tree is an abstract data type that stores elements hierarchically. With the exception of
the top element, each element in a tree has a parent element and zero or more children
elements.
A tree is usually visualized by placing elements inside ovals or rectangles, and by
drawing the connections between parents and children with straight lines.
Formal Tree Definition
Formally, we define a tree T as a set of nodes storing elements such that the nodes
have a parent-child relationship that satisfies the following properties:
 If T is nonempty, it has a special node, called the root of T, which has no parent.
 Each node v of T different from the root has a unique parent node w; every node with
parent w is a child of w.

Node Relationships
 A node v is external, if v has no children. External nodes are also known as leaves.
 A node v is internal if it has one or more children.
 A node u is an ancestor of a node v, if u = v or u is an ancestor of the parent of v.
 Conversely, we say that a node v is a descendant of a node u if u is an ancestor of v.
 A tree is ordered if there is a meaningful linear order among the children of each node;
 Path: Path refers to the sequence of nodes along the edges of a tree.
 Root: The node at the top of the tree is called root. There is only one root per tree
and one path from the root node to any node.
 Parent: Any node except the root node has one edge upward to a node called parent.
 Child: The node below a given node connected by its edge downward is called its
child node.
 Sub tree: Sub tree represents the descendants of a node.
 Traversing: Traversing means passing through nodes in a specific order.
 Levels: Level of a node represents the generation of a node. If the root node is at level
0, then its next child node is at level 1, its grandchild is at level 2, and so on.
 Keys: Key represents a value of a node based on which a search operation is to be
carried out for a node.
 Siblings: All the nodes that share the same parent are called siblings.
 Depth: The depth of a node N is the length of the path from the root to the node N
 Height: The Height of a node N is the length of the path from the node to the deepest
leaf.


Properties of Tree:
 Every tree has a special node called the root node. The root node can be used to
traverse every node of the tree. It is called root because the tree originated from root
only.
 If a tree has N vertices (nodes) then the number of edges is always one less than the
number of nodes (vertices), i.e. N-1. If it has more than N-1 edges it is called a graph, not
a tree.
 Every child has only a single parent, but a parent can have multiple children.

Edges and Paths in Trees


 An edge of tree T is a pair of nodes (u,v) such that u is the parent of v, or vice versa.
 A path of T is a sequence of nodes such that any two consecutive nodes in the
sequence form an edge.

Example

The Tree Abstract Data Type


A tree ADT uses the concept of a position as an abstraction for a node of a tree. An
element is stored at each position, and positions satisfy parent-child relationships that define
the tree structure.

Computing Depth and Height


Depth
Let p be the position of a node of a tree T. The depth of p is the number of ancestors
of p, excluding p itself.
 If p is the root, then the depth of p is 0.
 Otherwise, the depth of p is one plus the depth of the parent of p.
def depth(self, p):
    if self.is_root(p):
        return 0
    else:
        return 1 + self.depth(self.parent(p))


Height
The height of a position p in a tree T is also defined recursively:
 If p is a leaf, then the height of p is 0.
 Otherwise, the height of p is one more than the maximum of the heights of p’s
children. The height of a nonempty tree T is the height of the root of T.
def height(self, p):
    if self.is_leaf(p):
        return 0
    else:
        return 1 + max(self.height(c) for c in self.children(p))

Types of Tree:
1. Binary Tree
Binary tree is the type of tree in which each parent can have at most two children. The
children are referred to as left child or right child.

2. Binary Search Tree


Binary Search Tree (BST) is an extension of Binary tree with some added constraints.
In BST, the value of the left child of a node must be smaller than or equal to the value of its
parent and the value of the right child is always larger than or equal to the value of its parent.


3. AVL Tree:
AVL tree is a self-balancing binary search tree. In AVL tree, the heights of children of
a node differ by at most 1. The valid balancing factor in AVL tree are 1, 0 and -1. When a new
node is added to the AVL tree and tree becomes unbalanced then rotation is done to make
sure that the tree remains balanced.

4. B-tree
B-tree is another self-balancing search tree that comprises many nodes to keep data
stored in a particular order. Each node has over two child nodes and each node comprises
multiple keys. B-trees are compatible with file systems and databases that can write and read
larger blocks of data.

5. N-ary Tree:
In an N-ary tree, the maximum number of children that a node can have is limited to
N. A binary tree is a 2-ary tree, as each node in a binary tree has at most 2 children. The Trie
data structure is one of the most commonly used implementations of an N-ary tree. A full N-ary
tree is a tree in which each node has either 0 or N children. A complete N-ary tree is a tree in
which all the leaf nodes are at the same level.

Advantages of Tree:
 The tree reflects the data structural connections.
 The tree is used for hierarchy.
 It offers an efficient search and insertion procedure.
 The trees are flexible. This allows subtrees to be relocated with minimal effort.


Tree Traversal:
Traversal is a process to visit all the nodes of a tree and may print their values too.
Because, all nodes are connected via edges (links) we always start from the root (head) node.
That is, we cannot randomly access a node in a tree. There are three ways which we use to
traverse a tree −
• In-order Traversal
• Pre-order Traversal
• Post-order Traversal
Generally, we traverse a tree to search or locate a given item or key in the tree or to
print all the values it contains.

In-order Traversal
In this traversal method, the left sub tree is visited first, then the root and later the right
sub-tree. We should always remember that every node may represent a sub tree itself.
If a binary tree is traversed in-order, the output will produce sorted key values in an
ascending order.

We start from A, and following in-order traversal, we move to its left subtree B. B is
also traversed in-order. The process goes on until all the nodes are visited. The output of
in-order traversal of this tree will be:

D→B→E→A→F→C→G
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
Python Code for inorder traversal:
def Inorder(self):
    if self.left:
        self.left.Inorder()
    print(self.data)
    if self.right:
        self.right.Inorder()


Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and finally
the right subtree.

We start from A, and following pre-order traversal, we first visit A itself and then move
to its left subtree B. B is also traversed pre-order. The process goes on until all the nodes
are visited. The output of pre-order traversal of this tree will be:

A → B → D → E → C → F →G
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.

Python Code for preorder traversal:


def preorder(self):
    print(self.data)
    if self.left:
        self.left.preorder()
    if self.right:
        self.right.preorder()

Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we traverse
the left subtree, then the right subtree and finally the root node.


We start from A, and following Post-order traversal, we first visit the left subtree B. B
is also traversed post-order. The process goes on until all the nodes are visited. The output of
post-order traversal of this tree will be −

D→E→B→F→G→C→A
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.

Python Code for postorder traversal:


def postorder(self):
    if self.left:
        self.left.postorder()
    if self.right:
        self.right.postorder()
    print(self.data)

Preorder Traversal Inorder Traversal Post Order Traversal


Root-Left-Right Left-Root-Right Left-Right-Root
[A-B-D-E-C-F-G] [D-B-E-A-F-C-G] [D-E-B-F-G-C-A]
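The three orders can be checked on the example tree A–G used above. In this sketch the traversal functions return lists instead of printing, so the results are easy to compare against the table:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def preorder(node):              # Root-Left-Right
    if node is None:
        return []
    return [node.data] + preorder(node.left) + preorder(node.right)

def inorder(node):               # Left-Root-Right
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)

def postorder(node):             # Left-Right-Root
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.data]

# Build the example tree: A is the root, B and C its children, and so on
root = Node('A')
root.left, root.right = Node('B'), Node('C')
root.left.left, root.left.right = Node('D'), Node('E')
root.right.left, root.right.right = Node('F'), Node('G')

print(preorder(root))    # ['A', 'B', 'D', 'E', 'C', 'F', 'G']
print(inorder(root))     # ['D', 'B', 'E', 'A', 'F', 'C', 'G']
print(postorder(root))   # ['D', 'E', 'B', 'F', 'G', 'C', 'A']
```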


Binary Tree:
A binary tree is an ordered tree with the following properties:
1. Every node has at most two children.
2. Each child node is labeled as being either a left child or a right child.
3. A left child precedes a right child in the order of children of a node.

The subtree rooted at a left or right child of an internal node v is called a left subtree
or right subtree, respectively, of v.

Binary Tree Terminologies:


 Root: Topmost node in a tree.
 Parent: Every node (excluding a root) in a tree is connected by a directed edge from
exactly one other node. This node is called a parent.
 Child: A node directly connected to another node when moving away from the root.
 Leaf/External node: Node with no children.
 Internal node: Node with at least one child.
 Depth of a node: Number of edges from root to the node.
 Height of a node: Number of edges from the node to the deepest leaf. Height of the
tree is the height of the root.
 Sibling: Nodes with the same parent are called siblings.

Types of Binary Tree:


 Complete binary tree: It is a binary tree in which every level, except possibly the
last, is completely filled, and all nodes are as far left as possible.
 Full Binary Tree: A binary tree in which each node has either zero or two children.
 In a proper binary tree, every internal node has exactly two children.
 Perfect binary tree: It is a binary tree in which all interior nodes have two children
and all leaves have the same depth or same level.

Complete Binary Tree Full Binary Tree Perfect Binary Tree


Decision trees:
To represent a number of different outcomes that can result from answering a series
of yes-or-no questions. Each internal node is associated with a question. Starting at the root,
go to the left or right child of the current node, depending on whether the answer to the
question is “Yes” or “No.” Such binary trees are known as decision trees.

Decision Trees
Arithmetic expression tree:
An arithmetic expression can be represented by a binary tree whose leaves are
associated with variables or constants, and whose internal nodes are associated with one
of the operators +, −, ×, and /. Such tree is called arithmetic expression tree.

Arithmetic Expression Tree


Properties of Binary Tree


Finding Height of Binary Tree:


In a tree data structure, the number of edges from the leaf node to the particular node
in the longest path is known as the height of that node. In the tree, the height of the root node
is called "Height of Tree". Height of leaf node is always 0.

Example:

Algorithm:
def height(self, root):
    if root is None:
        return 0
    l = self.height(root.left)
    r = self.height(root.right)
    return max(l, r) + 1

Finding parent of a node:


The parent of a node is the node whose leftChild reference or rightChild reference is
pointing to the current node.
Example:

Parent of 2 is 1. Parent of 5 is 2.

Algorithm:
def parent(self, data):
    if self.data == data:
        print(data, "is the root")
    elif (self.left is not None and self.left.data == data) or \
         (self.right is not None and self.right.data == data):
        print(self.data)
    elif data < self.data and self.left is not None:
        self.left.parent(data)
    elif data > self.data and self.right is not None:
        self.right.parent(data)
    else:
        print("No such data")


Inserting a New node:


For inserting a node in a binary tree you will have to check the following conditions:
 If a node in the binary tree does not have its left child, then insert the given node (the
one that we have to insert) as its left child.
 If a node in the binary tree does not have its right child then insert the given node as
its right child.
 If the above-given conditions do not apply then search for the node which does not
have a child at all and insert the given node there.

Example:

Binary Tree After inserting 7.

Algorithm:
def insert(self, data):
    if self.data:
        if data < self.data:
            if self.left is None:
                self.left = Node(data)
            else:
                self.left.insert(data)
        elif data > self.data:
            if self.right is None:
                self.right = Node(data)
            else:
                self.right.insert(data)
    else:
        self.data = data

Finding / Searching a node:


Procedure:
a) It checks whether the root is null, which means the tree is empty.

b) If the tree is not empty, it will compare root’s data with value. If they are equal, it
will set the flag to true and return.

c) Traverse left subtree by calling searchNode() recursively and check whether the
value is present in left subtree.

d) Traverse right subtree by calling searchNode() recursively and check whether the
value is present in the right subtree.


Algorithm:
def search(self, data):
    if self.data == data:
        print("Data found")
    elif data < self.data and self.left is not None:
        self.left.search(data)
    elif data > self.data and self.right is not None:
        self.right.search(data)
    else:
        print("Not Found")


Binary Search Tree:

A binary tree that stores its elements in sorted order is called a binary search tree. A
binary search tree for a set S is a binary tree T such that, for each position p of T:

 Position p stores an element of S, denoted as e(p).


 Elements stored in the left subtree of p (if any) are less than e(p).
 Elements stored in the right subtree of p (if any) are greater than e(p).

Figure: A binary search tree with integer keys.

Navigating (Traversing) a Binary Search Tree:

Binary search tree hierarchically represents the sorted order of its keys. An inorder
traversal of a binary search tree visits positions in increasing order of their keys.

Algorithm for Traversing:


def inorder(self):
    if self.data is not None:
        if self.left is not None:
            self.left.inorder()
        print(self.data)
        if self.right is not None:
            self.right.inorder()

Finding / Searching a data in Binary Search Tree:


Searching in a binary search tree is based on decisions. At each position p, if the key
is less than the key at p, we search the left subtree; otherwise we search the right subtree.
If there is a match, the search returns. If we reach an empty subtree, the search terminates
unsuccessfully.

Algorithm for Searching:


def search(self, data):
    if self.data == data:
        print("Data found")
    elif data < self.data and self.left is not None:
        self.left.search(data)
    elif data > self.data and self.right is not None:
        self.right.search(data)
    else:
        print("Not Found")

Analysis of Binary Tree Searching

Algorithm TreeSearch is recursive and executes a constant number of primitive


operations for each recursive call. Each recursive call of TreeSearch is made on a child of the
previous position. That is, TreeSearch is called on the positions of a path of T that starts at
the root and goes down one level at a time. Thus, the number of such positions is bounded by
h+1, where h is the height of T. In other words, since we spend O(1) time per position
encountered in the search, the overall search runs in O(h) time, where h is the height of the
binary search tree T.

Insertion in Binary Search Tree:


If there is no root, the data is placed at the root. If the new data is less than the current
node's data, an appropriate position on the left side is found to place the data. Otherwise,
a position on the right side is found to place the data.

Algorithm for Binary Search Tree Insertion

def insert(self, data):
    if self.data:
        if data < self.data:
            if self.left is None:
                self.left = BinarySearchTree(data)
            else:
                self.left.insert(data)
        elif data > self.data:
            if self.right is None:
                self.right = BinarySearchTree(data)
            else:
                self.right.insert(data)
    else:
        self.data = data

Deletion in Binary Search Tree

To delete an item with key k, we begin by calling TreeSearch(T, T.root( ), k) to find the
position p of T storing an item with key equal to k. If the search is successful, we distinguish
between three cases
 Input the data of the node to be deleted.
 Case 1: If the node is a leaf node, delete the node directly.
 Case 2: Else if the node has one child, copy the child to the node to be deleted and
delete the child node.
 Case 3: Else if the node has two children, find the inorder successor of the node.
 Copy the contents of the inorder successor to the node to be deleted and delete
the inorder successor.

Figure: Before and After deleting 7


Algorithm for Deletion in BST:


def delete(self, root, value):
    if root is None:
        return root
    if value < root.value:
        root.left = self.delete(root.left, value)
    elif value > root.value:
        root.right = self.delete(root.right, value)
    else:
        # Case 1 and 2: at most one child, so return the other subtree
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Case 3: two children, copy the inorder successor and delete it
        successor = root.right
        while successor.left is not None:
            successor = successor.left
        root.value = successor.value
        root.right = self.delete(root.right, successor.value)
    return root

#Python Code for Binary Search Tree and its operations:

class BinarySearchTree:
    def __init__(self, data):
        self.left = None
        self.data = data
        self.right = None
        self.root = self.data

    # Example insertion order: 4, 3, 5, 1, 2
    def insert(self, data):
        if self.data:
            if data < self.data:
                if self.left is None:
                    self.left = BinarySearchTree(data)
                else:
                    self.left.insert(data)
            elif data > self.data:
                if self.right is None:
                    self.right = BinarySearchTree(data)
                else:
                    self.right.insert(data)
        else:
            self.data = data

    def Search(self, data):
        if self.data == data:
            print(self.data, "is present")
        elif data < self.data and self.left is not None:
            self.left.Search(data)
        elif data > self.data and self.right is not None:
            self.right.Search(data)
        else:
            print("Not Present")

    def findMin(self):
        if self.data:
            if self.left is not None:
                self.left.findMin()
            else:
                print(self.data)
        else:
            print("Tree Not Found")

    def findMax(self):
        if self.data:
            if self.right is not None:
                self.right.findMax()
            else:
                print(self.data)
        else:
            print("Tree Not Found")

    def parent(self, data):
        if self.data == data:
            print(data, "is the root")
        elif (self.left is not None and self.left.data == data) or \
             (self.right is not None and self.right.data == data):
            print(self.data)
        elif data < self.data and self.left is not None:
            self.left.parent(data)
        elif data > self.data and self.right is not None:
            self.right.parent(data)
        else:
            print("No such data")

    def inorder(self):
        if self.data is not None:
            if self.left is not None:
                self.left.inorder()
            print(self.data)
            if self.right is not None:
                self.right.inorder()


AVL Tree

Height-Balance Property: For every position p of T, the heights of the children of p


differ by at most 1.
Any binary search tree T that satisfies the height-balance property is said to be an
AVL tree, named after the initials of its inventors: Adel’son-Vel’skii and Landis.

A tree is called an AVL tree if each node of the tree possesses one of the following properties:

 A node is called left heavy if the longest path in its left subtree is one longer than the
longest path of its right subtree
 A node is called right heavy if the longest path in the right subtree is one longer than
the path in its left subtree
 A node is called balanced if the longest path in both the right and left subtree are equal.

AVL tree is a height-balanced tree where the difference between the heights of the
right subtree and left subtree of every node is either -1, 0 or 1. The difference between the
heights of the subtree is maintained by a factor named as balance factor. Therefore, we can
define AVL as it is a balanced binary search tree where the balance factor of every node in
the tree is either -1, 0, or +1. Here, the balance factor is calculated by the formula:

Balance Factor = Height_Of_Left_Subtree – Height_Of_Right_Subtree

As AVL is the height-balanced tree, it helps to control the height of the binary search
tree and further help the tree to prevent skewing. When the binary tree gets skewed, the
running time complexity becomes the worst-case scenario i.e O(n) but in the case of the AVL
tree, the time complexity remains O(logn). Therefore, it is always advisable to use an AVL tree
rather than a binary search tree.

Every AVL Tree is a binary search tree but every Binary Search Tree need not be AVL
Tree.


AVL Rotation

When certain operations like insertion and deletion are performed on the AVL tree, the
balance factor of the tree may get affected. If after the insertion or deletion of the element, the
balance factor of any node is affected then this problem is overcome by using rotation.
Therefore, rotation is used to restore the balance of the search tree. Rotation is the method of
moving the nodes of the tree either to the left or to the right to make the tree height-balanced.

There are two categories of rotation, each of which is further divided into two types:

1) Single Rotation
Single rotation switches the roles of the parent and child while maintaining the search
order. We rotate the node and its child, the child becomes a parent.

Single LL(Left Left) Rotation


Here, every node of the tree moves towards the right from its current position.
Therefore, a parent becomes the right child in LL rotation. Let us see the below examples

#Python Code for Rotation with left data


def lRotate(self, z):
    y = z.left
    z.left = y.right
    y.right = z
    z.height = 1 + max(self.getHeight(z.left), self.getHeight(z.right))
    y.height = 1 + max(self.getHeight(y.left), self.getHeight(y.right))
    return y

Single RR(Right Right) Rotation


Here, every node of the tree moves towards the left from the current position.
Therefore, the parent becomes a left child in RR rotation. Let us see the below example


#Python code for rotation with right data


def rRotate(self, z):
    y = z.right
    z.right = y.left
    y.left = z
    z.height = 1 + max(self.getHeight(z.left), self.getHeight(z.right))
    y.height = 1 + max(self.getHeight(y.left), self.getHeight(y.right))
    return y

2) Double Rotation
Single rotation does not fix the LR and RL cases. For these, we require a double
rotation involving three nodes. Therefore, a double rotation is equivalent to a sequence of
two single rotations.

LR(Left-Right) Rotation
The LR rotation is the process where we perform a single left rotation followed by a
single right rotation. Therefore, first, every node moves towards the left and then the node of
this new tree moves one position towards the right. Let us see the below example

def DoubleRotateWithLeft(self, z):
    z.left = self.rRotate(z.left)
    return self.lRotate(z)

RL (Right-Left) Rotation


The RL rotation is the process where we perform a single right rotation followed by a
single left rotation. Therefore, first, every node moves towards the right and then the node of
this new tree moves one position towards the left. Let us see the below example

def DoubleRotateWithRight(self, z):
    z.right = self.lRotate(z.right)
    return self.rRotate(z)


Operations In AVL Tree


There are 2 major operations performed on the AVL tree

1. Insertion Operation
2. Deletion Operation

Let us study them one by one in detail

Insertion Operation In AVL Tree


In the AVL tree, the new node is always added as a leaf node. After the insertion of the
new node, it is necessary to modify the balance factor of each node in the AVL tree using the
rotation operations. The algorithm steps of insertion operation in an AVL tree are:

1. Find the appropriate empty subtree where the new value should be added by
comparing the values in the tree
2. Create a new node at the empty subtree
3. The new node is a leaf and thus will have a balance factor of zero
4. Return to the parent node and adjust the balance factor of each node through the
rotation process and continue it until we are back at the root. Remember that the
modification of the balance factor must happen in a bottom-up fashion

Example:
The root node is added as shown in the below figure

A child node of the root is then added as shown below. Here the tree is balanced

Then, The right child is added to the parent node. Here, the balance factor of the tree is
changed, therefore, the LL rotation is performed and the tree becomes a balanced tree


Later, one more right child is added to the new tree as shown below

Again further, one more right child is added and the balance factor of the tree is changed.
Therefore, again LL rotation is performed on the tree and the balance factor of the tree is
restored as shown in the below figure

#Python code for Insertion in AVL Tee


def insert(self, root, key):
    if not root:
        return treeNode(key)
    elif key < root.value:
        root.left = self.insert(root.left, key)
    else:
        root.right = self.insert(root.right, key)
    root.height = 1 + max(self.getHeight(root.left), self.getHeight(root.right))
    b = self.getBal(root)

    if b > 1 and key < root.left.value:
        return self.lRotate(root)
    if b < -1 and key > root.right.value:
        return self.rRotate(root)
    if b > 1 and key > root.left.value:
        root.left = self.rRotate(root.left)
        return self.lRotate(root)
    if b < -1 and key < root.right.value:
        root.right = self.lRotate(root.right)
        return self.rRotate(root)
    return root
Deletion Operation In AVL

The deletion operation in the AVL tree is the same as the deletion operation in BST. In
the AVL tree, the node is always deleted as a leaf node and after the deletion of the node, the
balance factor of each node is modified accordingly. Rotation operations are used to modify
the balance factor of each node. The algorithm steps of deletion operation in an AVL tree are:

1. Locate the node to be deleted


2. If the node does not have any child, then remove the node
3. If the node has one child node, replace the content of the deletion node with the child
node and remove the node.
4. If the node has two children nodes, find the inorder successor node ‘k' which has no
child node and replace the contents of the deletion node with the ‘k’ followed by
removing the node.
5. Update the balance factor of the AVL tree

Example:
Let us consider the below AVL tree with the given balance factor as shown in the
figure below

Here, we have to delete the node '25' from the tree. As the node to be deleted does
not have any child node, we will simply remove the node from the tree


After removal of the tree, the balance factor of the tree is changed and therefore, the
rotation is performed to restore the balance factor of the tree and create the perfectly balanced
tree.

#Python code for AVL TREE

class treeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 1

class AVLTree:

    def insert(self, root, key):
        if not root:
            return treeNode(key)
        elif key < root.value:
            root.left = self.insert(root.left, key)
        else:
            root.right = self.insert(root.right, key)
        root.height = 1 + max(self.getHeight(root.left), self.getHeight(root.right))

        b = self.getBal(root)
        if b > 1 and key < root.left.value:
            return self.rRotate(root)
        if b < -1 and key > root.right.value:
            return self.lRotate(root)
        if b > 1 and key > root.left.value:
            root.left = self.lRotate(root.left)
            return self.rRotate(root)
        if b < -1 and key < root.right.value:
            root.right = self.rRotate(root.right)
            return self.lRotate(root)
        return root

    def lRotate(self, z):
        y = z.right
        T2 = y.left
        y.left = z
        z.right = T2
        z.height = 1 + max(self.getHeight(z.left), self.getHeight(z.right))
        y.height = 1 + max(self.getHeight(y.left), self.getHeight(y.right))
        return y

    def rRotate(self, z):
        y = z.left
        T3 = y.right
        y.right = z
        z.left = T3
        z.height = 1 + max(self.getHeight(z.left), self.getHeight(z.right))
        y.height = 1 + max(self.getHeight(y.left), self.getHeight(y.right))
        return y

    def DoubleRotateWithLeft(self, z):
        # LR case: left-rotate the left child, then right-rotate z
        z.left = self.lRotate(z.left)
        return self.rRotate(z)

    def DoubleRotateWithRight(self, z):
        # RL case: right-rotate the right child, then left-rotate z
        z.right = self.rRotate(z.right)
        return self.lRotate(z)

    def getHeight(self, root):
        if not root:
            return 0
        return root.height

    def getBal(self, root):
        if not root:
            return 0
        return self.getHeight(root.left) - self.getHeight(root.right)


    def preOrder(self, root):
        if not root:
            return
        print("{0} ".format(root.value), end="")
        self.preOrder(root.left)
        self.preOrder(root.right)

Tree = AVLTree()
root = None
root = Tree.insert(root, 1)
root = Tree.insert(root, 2)
root = Tree.insert(root, 3)
root = Tree.insert(root, 4)
root = Tree.insert(root, 5)
root = Tree.insert(root, 6)

print("Preorder traversal of the constructed AVL tree is")
Tree.preOrder(root)
print()


Heap:
Heap is a data structure that follows a complete binary tree's property and satisfies the
heap property. Therefore, it is also known as a binary heap. As we all know, the complete
binary tree is a tree in which every level is filled and all the nodes are as far left as
possible; only the last level may be incompletely filled.
In the heap data structure, we assign key-value or weight to every node of the tree.
Now, the root node key value is compared with the children’s nodes and then the tree is
arranged accordingly into two categories i.e., max-heap and min-heap.

Heapify:
The process of creating a heap data structure using the binary tree is called Heapify.
The heapify process is used to create the Max-Heap or the Min-Heap.


Min Heap
When the value of each internal node is smaller than the value of its children node then
it is called the Min-Heap Property. Also, in the min-heap, the value of the root node is the
smallest among all the other nodes of the tree. Therefore, if “a” has a child node “b” then

Key(a) < key(b)


represents the Min Heap Property. Let us display the min heap using an array.
Therefore, the root node will be arr[0]. So, for the kth node i.e., arr[k]:

 arr[(k - 1)//2] will return the parent node

 arr[(2*k) + 1] will return the left child

 arr[(2*k) + 2] will return the right child

#Python Code

def min_heapify(A, k):
    l = left(k)
    r = right(k)
    if l < len(A) and A[l] < A[k]:
        smallest = l
    else:
        smallest = k
    if r < len(A) and A[r] < A[smallest]:
        smallest = r
    if smallest != k:
        A[k], A[smallest] = A[smallest], A[k]
        min_heapify(A, smallest)

def left(k):
    return 2 * k + 1

def right(k):
    return 2 * k + 2

def build_min_heap(A):
    n = int((len(A) // 2) - 1)
    for k in range(n, -1, -1):
        min_heapify(A, k)

A = [3, 9, 2, 1, 4, 5]
build_min_heap(A)
print(A)

Max Heap
When the value of each internal node is greater than the values of its child nodes, the tree
satisfies the Max-Heap Property. Also, in the max-heap, the value of the root node is the
greatest among all the other nodes of the tree. Therefore, if “a” has a child node “b” then

Key(a) > Key(b)


represents the Max-Heap Property. Let us represent the max heap using an array.
Therefore, the root node will be arr[0]. So, for the kth node, i.e., arr[k]:

 arr[(k - 1)//2] will return the parent node

 arr[(2*k) + 1] will return the left child

 arr[(2*k) + 2] will return the right child


#Python Code

def max_heapify(A, k):
    l = left(k)
    r = right(k)
    if l < len(A) and A[l] > A[k]:
        largest = l
    else:
        largest = k
    if r < len(A) and A[r] > A[largest]:
        largest = r
    if largest != k:
        A[k], A[largest] = A[largest], A[k]
        max_heapify(A, largest)

def left(k):
    return 2 * k + 1

def right(k):
    return 2 * k + 2

def build_max_heap(A):
    n = len(A) // 2 - 1
    for k in range(n, -1, -1):
        max_heapify(A, k)

A = [3, 9, 2, 1, 4, 5]
build_max_heap(A)
print(A)   # [9, 4, 5, 1, 3, 2]

Time complexity

Each call to heapify costs O(log n), and building the heap makes O(n) such calls, so a
simple bound on the overall time complexity is O(n log n). (A tighter analysis shows that
the bottom-up build-heap actually runs in O(n) time.)

Applications of Heap

 Heap is used while implementing priority queue


 Heap is used in Heap sort
 Heap data structure is used while working with Dijkstra's algorithm
 We can use max-heap and min-heap in the operating system for the job scheduling
algorithm
 It is used in the selection algorithm
 Heap data structure is used in graph algorithms like Prim's algorithm
 It is used in order statistics
 Heap data structure is used in k-way merge
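Several of these applications rest on the same sift-down routine shown earlier. As an
illustration, a self-contained heap-sort sketch (the function names here are illustrative):

```python
def max_heapify(A, k, n):
    # Sift A[k] down within A[:n] so the subtree rooted at k is a max-heap.
    largest = k
    l, r = 2 * k + 1, 2 * k + 2
    if l < n and A[l] > A[largest]:
        largest = l
    if r < n and A[r] > A[largest]:
        largest = r
    if largest != k:
        A[k], A[largest] = A[largest], A[k]
        max_heapify(A, largest, n)

def heap_sort(A):
    n = len(A)
    for k in range(n // 2 - 1, -1, -1):   # build a max-heap bottom-up
        max_heapify(A, k, n)
    for end in range(n - 1, 0, -1):       # move the root (maximum) to the end
        A[0], A[end] = A[end], A[0]
        max_heapify(A, 0, end)
    return A

print(heap_sort([3, 9, 2, 1, 4, 5]))      # [1, 2, 3, 4, 5, 9]
```

Each of the n extractions costs one O(log n) sift-down, giving the O(n log n) sorting time
mentioned above.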


Multiway Search Tree:


A multiway search tree is a search tree whose nodes may have two or more children, so
each internal node can store more than one key. A (2,4) search tree is a multiway search
tree in which every internal node has at least two and at most four children.
Properties:
 Each internal node of T has at least two children. That is, each internal node is
a d-node such that d ≥ 2.
 Each internal d-node w of T with children c1,...,cd stores an ordered set of d −1
key-value pairs (k1,v1),..., (kd−1,vd−1), where k1 ≤···≤ kd−1.
 Let us conventionally define k0 = −∞ and kd = +∞. For each item (k,v) stored at a
node in the subtree of w rooted at ci, i = 1,...,d, we have that ki−1 ≤ k ≤ ki.
Example:

Searching in a Multiway Tree:


We perform such a search by tracing a path in T starting at the root. When we are at a
d-node w during this search, we compare the key k with the keys k1,...,kd−1 stored at w.
If k = ki for some i, the search is successfully completed. Otherwise, we continue the
search in the child ci of w such that ki−1 < k < ki.
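This search rule can be sketched in Python; the MNode class and the sample tree below
are illustrative, not part of the text:

```python
import bisect

class MNode:
    # A d-node: d-1 sorted keys and either d children or None for a leaf.
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children

def multiway_search(node, k):
    # Trace a path from the root, descending into the child c_i
    # whose key range k_{i-1} < k < k_i contains k.
    while node is not None:
        i = bisect.bisect_left(node.keys, k)
        if i < len(node.keys) and node.keys[i] == k:
            return True                     # k == k_i: search succeeds
        node = node.children[i] if node.children else None
    return False

# A small (2,4)-style tree: the root stores 10 and 20 and has three children.
root = MNode([10, 20], [MNode([2, 5]), MNode([14]), MNode([25, 30])])
print(multiway_search(root, 14), multiway_search(root, 7))  # True False
```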

(2,4)-Tree Operations:
A multiway search tree that keeps the secondary data structures stored at each node
small and also keeps the primary multiway tree balanced is the (2,4) tree, which is sometimes
called a 2-4 tree or 2-3-4 tree. This data structure achieves these goals by maintaining two
simple properties,
Size Property: Every internal node has at most four children.
Depth Property: All the external nodes have the same depth


Insertion in (2-4) Tree:

Analysis of Insertion in a (2,4) Tree:


Because dmax is at most 4, the original search for the placement of new key k uses
O(1) time at each level, and thus O(logn) time overall, since the height of the tree is O(logn).
The modifications to a single node to insert a new key and child can be implemented to run in
O(1) time, as can a single split operation. The number of cascading split operations is bounded
by the height of the tree, and so that phase of the insertion process also runs in O(log n) time.
Therefore, the total time to perform an insertion in a (2,4) tree is O(logn).


Deletion in (2-4) Tree:

Performance of (2,4) Trees:


The asymptotic performance of a (2,4) tree is identical to that of an AVL tree in terms
of the sorted map ADT, with guaranteed logarithmic bounds for most operations. The time
complexity analysis for a (2,4) tree having n key value pairs is based on the following:
 The height of a (2,4) tree storing n entries is O(logn).
 A split, transfer, or fusion operation takes O(1) time.
 A search, insertion, or removal of an entry visits O(logn) nodes. Thus, (2,4)
trees provide for fast map search and update operations.


2 Mark Question with answers


1. What is tree?
A tree is an abstract data type that stores elements hierarchically. With the exception
of the top element, each element in a tree has a parent element and zero or more children
elements.

2. What is sibling?
Two nodes that are children of the same parent are siblings.

3. Define Binary Tree?


Binary tree is a finite set of elements that is either empty or is partitioned into three
disjoint subsets- The root, left sub-tree and right sub-tree.

4. What is a leaf node?


The nodes that do not have any sons are called leaf nodes.

5. Define the term ancestor?


Node n1 is said to be the ancestor of node n2, if n1 is either the father of n2 or father
of some ancestor of n2.


6. Give the array representation of the given binary tree?

7. What are the Applications of Tree Traversals?


 Table of Contents
 Parenthetic Representations of a Tree
 Computing Disk Space

8. Define the term descendent?


Node n2 is said to be the descendant of node n1, if n2 is either the left or right son of
node n1 or son of some descendant of n1.

9. Define the term left descendant?


A node n2 is the left descendant of node n1, if n2 is either the left son of n1 or a
descendant of the left son of n1.

10. Define the term right descendant?


A node n2 is the right descendant of node n1, if n2 is either the right son of n1 or a
descendant of the right son of n1.

11. Define expression tree.


Expression tree is a binary tree to represent the structure of an arithmetic expression.
The leaves of an expression tree are operands such as constants or variable names and the
other nodes contain operators.

Tree representing the arithmetic expression: A * (B − C) + (D + E)


12. What is a strictly binary tree?


The binary tree, in which every non-leaf node has nonempty left and right sub-trees,
is called a strictly binary tree.

13. Define level of a binary tree?


The level of a binary tree is defined as, the root of the tree has level 0, and the level of
any other node in the tree is one more than the level of its father.

14. Define Depth of a binary tree?


The depth of a binary tree is the maximum level of any leaf in the tree. This is same as
the length of the longest path from the root to any leaf.

15. What is a complete binary tree?


A complete binary tree of depth d is the strictly binary tree all of whose leaves are at
level d.


16. Write the in-order, pre-order, post-order and Breadth-First or Level Order Traversal
for the given tree.
Inorder : 3 7 8 6 11 2 5 4 9
Preorder : 2 7 3 6 8 11 5 9 4
Postorder : 3 8 11 6 7 4 9 5 2

17. What is Full Binary Tree?


For a full binary tree, every node has either 2 children or 0 children.

18. What is Degree of a node in a tree? What is the Degree of A and B for the given tree?

19. What is an Internal/External node?


Leaf nodes are external nodes and non-leaf nodes are internal nodes.

20. What is Height of a node in a tree? What is the height of E in the given node?


 If ‘p’ is a leaf, then the height of p is 0.


 Otherwise, the height of p is one more than the maximum of the heights of p’s
children.
 The height of a nonempty tree T is the height of the root of T.

21. What is Perfect Binary Tree?


A binary tree is a Perfect Binary Tree if all internal nodes have two children and all
leaves are at the same level.

22. Draw a tree for the given data in the array?

Binary Tree

23. What are the application of the complete binary tree?


 Heap sort
 Heap-based data structures such as priority queues

24. What are the properties of complete binary tree?

Properties of Complete Binary Tree:


 In a complete binary tree the number of nodes at depth d is 2^d.
 In a complete binary tree with n nodes height of the tree is log (n+1).
 All the levels except the last level are completely full.

25. What are the basic operations on a binary tree?


Let p be a pointer to a node and x be the information. Now, the basic operations are:
i) info(p)
ii) father(p) / parent(p)
iii) left(p)
iv) right(p)
v) brother(p)
vi) isleft(p)
vii) isright(p)

26. What is the length of the path in a tree?


The length of the path is the number of edges on the path. In a tree there is exactly
one path from the root to each node.

- The Length of the path A-B-E-J is 3.

- The length of the path C-G-K is 2.

27. What are the applications of binary tree?


Binary tree is used in data processing.
a. File index schemes
b. Hierarchical database management system

28. What is meant by traversing?


Traversing a tree means processing it in such a way, that each node is visited only
once.

29. What are the different types of traversing?


a. Pre-order traversal. (Root-Left-Right)
b. In-order traversal (Left-Root-Right)
c. Post-order traversal (Left-Right-Root)

30. What are the two methods of binary tree implementation?


a. Linear representation.
b. Linked representation

31. Define pre-order traversal?


a. Visit the root node
b. Traverse the left sub-tree
c. Traverse the right sub-tree


32. What is a binary search tree?


A binary tree in which all the elements in the left sub-tree of a node n are less than the
contents of n, and all the elements in the right sub-tree of n are greater than or equal to the
contents of n is called a binary search tree.

33. How can you say a recursive procedure is more efficient than a non-recursive one?
There is no extra bookkeeping: the automatic stacking and unstacking of parameters
make it more efficient to express, and no extraneous parameters or local variables are used.

34. Define AVL tree?


AVL tree is also called a height-balanced tree. It is a binary search tree in which every
node has a balance factor of −1, 0, or 1. The balance factor of a node is given by the
difference between the height of the left sub-tree and the height of the right sub-tree.

35. Why AVL Trees?


Most of the BST operations (e.g., search, max, min, insert, delete.. etc) take O(h) time
where h is the height of the BST. The cost of these operations may become O(n) for a skewed
Binary tree. If we make sure that height of the tree remains O(Logn) after every insertion and
deletion, then we can guarantee an upper bound of O(Logn) for all these operations. The
height of an AVL tree is always O(Logn) where n is the number of nodes in the tree.

36. What is heap?


A heap is a tree-based data structure in which all the nodes of the tree are in a specific
order. There are two types of heap. 1) Min Heap 2) Max Heap

37. What are Max Heap and Min Heap?


In a Max-Heap the key present at the root node must be greatest among the keys
present at all of its children. The same property must be recursively true for all sub-trees in
that Binary Tree.
In a Min-Heap the key present at the root node must be minimum among the keys
present at all of its children. The same property must be recursively true for all sub-trees in
that Binary Tree.


38. What are the applications of Heap?


 Heap-implemented priority queues are used in graph algorithms like Prim's algorithm
and Dijkstra's algorithm.
 Order statistics: The Heap data structure can be used to efficiently find the kth smallest
(or largest) element in an array.
 Priority Queues: Priority queues can be efficiently implemented using Binary Heap
because it supports insert(), delete() and extractmax(), decreaseKey() operations in
O(logn) time.

39. In a binary max heap containing n numbers, the smallest element can be found in
________ time.
Time complexity : O(n) In a max heap, the smallest element is always present at a
leaf node. So we need to check for all leaf nodes for the minimum value. Worst case
complexity will be O(n).

40. What are the two relational property of heap?


Heap-Order Property: In a heap T, for every position p other than the root, the key
stored at p is greater than or equal to the key stored at p's parent.
Complete Binary Tree Property: A heap T with height h is a complete binary tree if
levels 0,1,2, . . . ,h−1 of T have the maximum number of nodes possible (namely, level i has
2^i nodes, for 0 ≤ i ≤ h−1) and the remaining nodes at level h reside in the leftmost possible
positions at that level.

41. What is multi -way search tree?


Multi-way trees or m-Way tree are generalised versions of binary trees where each
node contains multiple elements. In an m-Way tree of order m, each node contains a maximum
of m – 1 elements and m children.

42. What are the operations of Heap?


heapify(iterable) :- This function is used to convert the iterable into a heap data
structure, i.e., into heap order.
heappush(heap, ele) :- This function is used to insert the element mentioned in its
arguments into the heap. The order is adjusted so that the heap structure is maintained.
heappop(heap) :- This function is used to remove and return the smallest element
from the heap. The order is adjusted so that the heap structure is maintained.
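These operations come from Python's standard heapq module; a brief, self-contained
demonstration:

```python
import heapq

data = [3, 9, 2, 1, 4, 5]
heapq.heapify(data)          # rearrange the list, in place, into heap order
print(data[0])               # 1 -- the smallest element sits at the root

heapq.heappush(data, 0)      # insert while preserving the heap property
print(heapq.heappop(data))   # 0 -- heappop always returns the smallest
print(heapq.heappop(data))   # 1
```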


43. What is heapify?


Heapify is the process of creating a heap data structure from a binary tree. It is used
to create a Min-Heap or a Max-Heap.

Max Heapify


UNIT V - GRAPH STRUCTURES


Graph ADT – representations of graph – graph traversals – DAG – topological
ordering – shortest paths – minimum spanning trees.

Graphs:
A graph is a way of representing relationships that exist between pairs of objects.
That is, a graph is a set of objects, called vertices, together with a collection of pairwise
connections between them, called edges. It can also be represented as G=(V, E).

Graphs have applications in modelling many domains, including mapping, transportation,
computer networks, and electrical engineering.
A graph G is simply a set V of vertices and a collection E of pairs of vertices from V,
called edges. Thus, a graph is a way of representing connections or relationships between
pairs of objects from some set V.

Graph Data Structure Terminologies


Mathematical graphs can be represented in data structure. We can represent a graph using
an array of vertices and a two-dimensional array of edges. Before we proceed further, let's
familiarize ourselves with some important terms –

Vertex − Each node of the graph is represented as a vertex. In the following example, the
labelled circle represents vertices. Thus, A to E are vertices.
Edge − Edge represents a path between two vertices or a line between two vertices. In the
following example, the lines from A to B, B to E, and so on represents edges.
Adjacency − Two nodes or vertices are adjacent if they are connected to each other through
an edge. In the following example, B is adjacent to A, D is adjacent to B, and so on.

Path − Path represents a sequence of edges between the two vertices. In the following
example,


A-B-D-E represents a path from A to E.

Directed path - is a path such that all edges are directed and are traversed along their
direction.

Length - The number of edges in a path is called the length of the path. For example,
the length of the path A-B-D in the above graph is 2 because it contains the two edges
(A,B) and (B,D).

Degree - The number of edges incident on a vertex in a graph is called as degree. It is


classified in to two types.
 Indegree
 Outdegree
o Indegree
The number of incoming edges to a vertex is called as indegree
Indegree of A = 0 Indegree of B = 2
Indegree of C = 1 Indegree of D = 3
o Outdegree
The number of outgoing edges from a vertex is called as outdegree
Outdegree of A=3 Outdegree of B=1
Outdegree of C=1 Outdegree of D=0
Reachable - Given vertices u and v of a (directed) graph G, we say that u reaches v, and
that v is reachable from u, if G has a (directed) path from u to v.
strongly connected - A directed graph G is strongly connected if, for any two vertices u
and v of G, u reaches v and v reaches u.
An undirected graph is connected, if there is a path from every vertex to every other
vertex. A directed graph with this property is called strongly connected.


subgraph - A subgraph of a graph G is a graph H whose vertices and edges are subsets of
the vertices and edges of G, respectively.

Tree - A tree is a connected forest, that is, a connected graph without cycles.

Types of Graphs
1.Directed Graph
Directed graph is a graph in which edges are directed. Here each edge is unidirectional. In
a directed graph the edge (A,C) is not the same as (C,A). It is also called a digraph.

2. Undirected Graph
Undirected graph is a graph in which edges are undirected. Here each edge is Bidirectional.
In undirected graph, (A,C) = (C,A )

3. Weighted Graph
Weighted graph is a graph in which edges are assigned by some a weight or value.
This value is considered as cost/distance of traversing from one vertex to another vertex.
Weighted graph can be either directed or undirected.


4. Complete Graph
Complete graph is a graph in which there is an edge between each pair of vertices.
Here there is a path from each vertex to every other vertex. A complete graph with n vertices
should have n(n-1)/2 edges.

5. Cyclic Graph
Cyclic graph is a graph which has cycles. Cycle is a path which starts and ends at
same vertex.

6. Acyclic Graph
Acyclic graph is a graph in which does not have cycles in it. It is also called as Directed
Acyclic Graph(DAG).

Representation of Graphs or Data Structures for Graphs


The four commonly used representations of Graphs are:
1. Edge List - maintains an unordered list of all edges, but there is no efficient way to
locate a particular edge (u,v), or the set of all edges incident to a vertex v.
2. Adjacency List - for each vertex, a separate list containing those edges that are
incident to the vertex. The complete set of edges can be determined by taking the
union of the smaller sets.
3. Adjacency Map - is very similar to an adjacency list, but the secondary container
of all edges incident to a vertex is organized as a map, rather than as a list, with
the adjacent vertex serving as a key. This allows for access to a specific edge (u,v)
in O(1) expected time.
4. Adjacency Matrix - provides worst-case O(1) access to a specific edge (u,v) by
maintaining an n × n matrix, for a graph with n vertices. Each entry is dedicated to
storing a reference to the edge (u,v) for a particular pair of vertices u and v; if no
such edge exists, the entry will be None.


1. Edge List:
It maintains an unordered list of all edges, but there is no efficient way to locate a
particular edge (u,v), or the set of all edges incident to a vertex v.

Performance of the Edge List Structure:


The performance of an edge list structure in fulfilling the graph ADT is O(n + m) for
representing a graph with n vertices and m edges. Each individual vertex or edge instance
uses O(1) space, and the additional lists V and E use space proportional to their number of
entries.

2. Adjacency List Structure:


The adjacency list structure groups the edges of a graph by storing them in smaller,
secondary containers that are associated with each individual vertex. Specifically, for each
vertex v, we maintain a collection I(v), called the incidence collection of v, whose entries are
edges incident to v.


Performance of the Adjacency List Structure:


 O(n + m) for representing a graph with n vertices and m edges.
 Each individual vertex or edge instance uses O(1) space.
 Vertex count and edge count methods run in O(1) time.
 Methods vertices and edges run respectively in O(n) and O(m) time.

3. Adjacency Map Structure:


It is very similar to an adjacency list, but the secondary container of all edges incident
to a vertex is organized as a map, rather than as a list, with the adjacent vertex serving as a
key. This allows for access to a specific edge (u,v) in O(1) expected time.

Performance of Adjacency Map Structure:


 Space usage for an adjacency map remains O(n+ m).
 Edge(u,v) method can be implemented in expected O(1) time.
 Worst-case bound retains O(min(deg(u),deg(v))).
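A minimal adjacency-map sketch, using a dictionary of dictionaries (the Graph class below
is illustrative, not the textbook's implementation):

```python
class Graph:
    # Adjacency map: each vertex maps to {neighbour: edge value}.
    def __init__(self):
        self.adj = {}

    def add_vertex(self, v):
        self.adj.setdefault(v, {})

    def add_edge(self, u, v, value=None):
        self.add_vertex(u)
        self.add_vertex(v)
        self.adj[u][v] = value
        self.adj[v][u] = value       # undirected: record both directions

    def get_edge(self, u, v):
        # Expected O(1) time, thanks to the hash-based secondary map.
        return self.adj.get(u, {}).get(v)

    def degree(self, v):
        return len(self.adj[v])

g = Graph()
g.add_edge('A', 'B', 7)
g.add_edge('A', 'C', 2)
print(g.get_edge('A', 'B'), g.degree('A'))   # 7 2
```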

145
Downloaded by s h s g (pgpcet2022@gmail.com)
lOMoARcPSD|42733091

4. Adjacency Matrix Structure:


It provides worst-case O(1) access to a specific edge (u,v) by maintaining an n × n matrix,
for a graph with n vertices. Each entry is dedicated to storing a reference to the edge (u,v)
for a particular pair of vertices u and v; if no such edge exists, the entry will be None.

Performance of Adjacency Matrix Structure:


 Any edge (u,v) can be accessed in worst-case O(1) time.
 Adjacency list or map can locate those edges in optimal O(deg(v)) time.
 Adding or removing vertices from a graph is problematic, as the matrix must be
resized.
 The O(n^2) space usage of an adjacency matrix is typically far worse than the
O(n + m) space used by the other representations.
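A matrix for a small undirected graph can be built directly from an edge list (a minimal
sketch; vertices are assumed to be numbered 0..n−1):

```python
def adjacency_matrix(n, edges):
    # n x n matrix; entry [u][v] references the edge, or None if absent.
    M = [[None] * n for _ in range(n)]
    for (u, v) in edges:
        M[u][v] = (u, v)
        M[v][u] = (u, v)          # undirected: mirror the entry
    return M

M = adjacency_matrix(3, [(0, 1), (1, 2)])
print(M[0][1], M[0][2])           # (0, 1) None -- worst-case O(1) lookups
```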


Graph Traversals:
A traversal is a systematic procedure for exploring a graph by examining all of its
vertices and edges. A traversal is efficient if it visits all the vertices and edges in time
proportional to their number, that is, in linear time.
Graph traversal shows the notion of reachability. Reachability in an undirected graph
G include the following:
 Computing a path from vertex u to vertex v, or reporting that no such path
exists.
 Given a start vertex s of G, computing, for every vertex v of G, a path with the
minimum number of edges between s and v, or reporting that no such path
exists.
 Testing whether G is connected.
 Computing a spanning tree of G, if G is connected.
 Computing a cycle in G, or reporting that G has no cycles.

Reachability in a directed graph G include the following:


 Computing a directed path from vertex u to vertex v, or reporting that no such
path exists.
 Finding all the vertices of G that are reachable from a given vertex s.
 Determine whether G is acyclic.
 Determine whether G is strongly connected.
To reach all the nodes of a graph we need Graph Traversal Techniques. There are two
types of Graph Traversal methods,
1. Depth First Search
2. Breadth First Search

Depth First Search:


Depth-first search is useful for testing 1) Whether there is a path from one vertex to
another and 2) Whether or not a graph is connected.
Procedure:
Step 1: Visit adjacent unvisited vertex. Mark it visited. Display it. Push it in a stack.
Step 2: If no adjacent vertex found, pop up a vertex from stack. (It will pop up all the vertices
from the stack which do not have adjacent vertices.)
Step 3: Repeat from Step 1 until stack is empty.


Algorithm DFS(G,u): {We assume u has already been marked as visited}


Input: A graph G and a vertex u of G
Output: A collection of vertices reachable from u, with their discovery edges
for each outgoing edge e = (u,v) of u do
if vertex v has not been visited then
Mark vertex v as visited (via edge e).
Recursively call DFS(G,v).

Example:


Running Time of Depth-First Search:


 incident edges(v) takes O(deg(v)) time.
 e.opposite(v) method takes O(1) time.
 edge has been explored in O(1) time.
Python Code:
def dfs(visited, graph, node):
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)
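A quick, self-contained run of this DFS on a small sample graph (the graph dictionary is
illustrative):

```python
def dfs(visited, graph, node):
    # Recursive depth-first search, as defined above.
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
dfs(set(), graph, 'A')   # visits A, B, D, then C
```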

Breadth-First Search
Traversing a connected component of a graph, known as a breadth-first search
(BFS)
Procedure:
A BFS proceeds in rounds and subdivides the vertices into levels.
 BFS starts at vertex s, which is at level 0.
 In the first round, we paint as “visited,” all vertices adjacent to the start vertex s-
these vertices are one step away from the beginning and are placed into level 1.
 In the second round, we allow all explorers to go two steps (i.e., edges) away from
the starting vertex. These new vertices, which are adjacent to level 1 vertices and not
previously assigned to a level, are placed into level 2 and marked as “visited.”
 This process continues in similar fashion, terminating when no new vertices are
found in a level.
Python Code:
def bfs(visited, graph, node):
    visited.append(node)
    queue = [node]                   # FIFO queue of discovered vertices
    while queue:
        s = queue.pop(0)
        print(s, end=" ")
        for neighbour in graph[s]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}   # sample graph
bfs([], graph, 'A')                  # prints: A B C D

Example:


Running time of Breadth First Search:


 A BFS traversal of G takes O(n+m) time.


Directed Acyclic Graphs:


Directed graphs without cycles are referred to as Directed Acyclic Graphs (DAGs).
Topological Ordering:
A topological ordering is an ordering such that any directed path in G traverses
vertices in increasing order. Note that a directed graph may have more than one topological
ordering.
Pseudocode
def topsort(G):
    for counter in range(NumVertex):
        V = find_new_vertex_of_indegree_zero()
        if V == NotAVertex:
            raise Error("Graph has a cycle")
        top_num[V] = counter
        for W in vertices_adjacent_to(V):
            indegree[W] -= 1
Example:


Ordering of vertices using Topological Sorting:

A B C D E F G H

A 0 0 0 0 0 0 0 0

B 0 0 0 0 0 0 0 0

C 1 0 0 0 0 0 0 0

D 3 2 1 1 0 0 0 0

E 1 1 0 0 0 0 0 0

F 2 2 2 2 1 0 0 0

G 2 2 2 1 1 1 0 0

H 3 3 2 2 2 2 1 0

A C E B D F G H

ORDERED VERTICES ARE : A C E B D F G H
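The pseudocode above corresponds to Kahn's algorithm. A runnable sketch using an
explicit queue of indegree-zero vertices (the sample graph is illustrative):

```python
from collections import deque

def topsort(graph):
    # graph: {vertex: [successors]}. Returns the vertices in topological order.
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:        # removing v lowers its successors' indegree
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(order) != len(graph):
        raise ValueError("Graph has a cycle")
    return order

g = {'A': ['C', 'D'], 'B': ['D'], 'C': ['E'], 'D': [], 'E': []}
print(topsort(g))   # ['A', 'B', 'C', 'D', 'E']
```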


Greedy Algorithm
An algorithm is designed to achieve an optimum solution for a given problem. Greedy
algorithms try to find a localized optimum solution (i.e., the problem is solved by selecting
the best option available at the moment), which may eventually lead to a globally optimized
solution. A greedy algorithm does not worry whether the current best result will bring the
overall optimal result.
The algorithm never reverses an earlier decision, even if the choice is wrong. It
works in a top-down approach.
This algorithm may not produce the best result(globally optimized solution) for all the
problems. It's because it always goes for the local best choice to produce the global best
result.
However, we can determine if the algorithm can be used with any problem if the
problem has the following properties:
1. Greedy Choice Property
If an optimal solution to the problem can be found by choosing the best choice at
each step without reconsidering the previous steps once chosen, the problem can be solved
using a greedy approach. This property is called greedy choice property.
2. Optimal Substructure
If the optimal overall solution to the problem corresponds to the optimal solution to its
subproblems, then the problem can be solved using a greedy approach. This property is
called optimal substructure.
Advantages of Greedy Approach
 The algorithm is easier to describe.
 This algorithm can perform better than other algorithms (but, not in all cases).
Drawback of Greedy Approach
As mentioned earlier, the greedy algorithm doesn't always produce the optimal
solution. This is the major disadvantage of the algorithm
For example, suppose we want to find the longest path in the graph below from root
to leaf. Let's use the greedy algorithm here.

Apply greedy approach to this tree to find the longest route

Greedy Approach

1. Let's start with the root node 20. The weight of the right child is 3 and the weight of the left child is 2.
2. Our problem is to find the largest path. And, the optimal solution at the moment is 3. So, the greedy
algorithm will choose 3.
3. Finally, the weight of the only child of 3 is 1. This gives us our final result 20 + 3 + 1 = 24.

However, it is not the optimal solution. There is another path that carries more weight (20 + 2 + 10 = 32)
as shown in the image below.

Longest path
Therefore, greedy algorithms do not always give an optimal/feasible solution.

Greedy Algorithm
1. To begin with, the solution set (containing answers) is empty.
2. At each step, an item is added to the solution set until a solution is reached.
3. If the solution set is feasible, the current item is kept.
4. Else, the item is rejected and never considered again.

Counting Coins
This problem is to count to a desired value by choosing the least possible coins and
the greedy approach forces the algorithm to pick the largest possible coin. If we are
provided coins of ₹ 1, 2, 5 and 10 and we are asked to count ₹ 18 then the greedy
procedure will be −
 1 − Select one ₹ 10 coin, the remaining count is 8
 2 − Then select one ₹ 5 coin, the remaining count is 3
 3 − Then select one ₹ 2 coin, the remaining count is 1
 4 − And finally, the selection of one ₹ 1 coins solves the problem
Though, it seems to be working fine, for this count we need to pick only 4 coins. But
if we slightly change the problem then the same approach may not be able to produce the
same optimum result.
For the currency system, where we have coins of 1, 7, 10 value, counting coins for
value 18 will be absolutely optimum but for count like 15, it may use more coins than
necessary. For example, the greedy approach will use 10 + 1 + 1 + 1 + 1 + 1, total 6 coins.
Whereas the same problem could be solved by using only 3 coins (7 + 7 + 1)
Hence, we may conclude that the greedy approach picks an immediate optimized
solution and may fail where global optimization is a major concern.
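The coin-counting procedure above can be sketched directly; the second call reproduces
the suboptimal 6-coin answer:

```python
def greedy_change(coins, amount):
    # Greedy choice: always take the largest coin that still fits.
    picked = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            picked.append(coin)
            amount -= coin
    return picked

print(greedy_change([1, 2, 5, 10], 18))   # [10, 5, 2, 1] -- 4 coins, optimal
print(greedy_change([1, 7, 10], 15))      # [10, 1, 1, 1, 1, 1] -- 6 coins,
                                          # although 7 + 7 + 1 needs only 3
```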
Examples
Most networking algorithms use the greedy approach. Here is a list of few of them −
 Travelling Salesman Problem
 Prim's Minimal Spanning Tree Algorithm
 Kruskal's Minimal Spanning Tree Algorithm
 Dijkstra's Minimal Spanning Tree Algorithm
 Graph - Map Coloring
 Graph - Vertex Cover


Dynamic programming
Dynamic programming approach is similar to divide and conquer in breaking down
the problem into smaller and yet smaller possible sub-problems. But unlike divide and
conquer, these sub-problems are not solved independently. Rather, the results of these
smaller sub-problems are remembered and used for similar or overlapping sub-problems.
Dynamic programming is used where we have problems, which can be divided into
similar sub-problems, so that their results can be re-used. Mostly, these algorithms are used
for optimization. Before solving the in-hand sub-problem, dynamic algorithm will try to
examine the results of the previously solved sub-problems. The solutions of sub-problems
are combined in order to achieve the best solution.
So we can say that:
• The problem should be divisible into smaller overlapping sub-problems.
• An optimum solution can be achieved by using optimum solutions of the smaller
sub-problems.
• Dynamic algorithms use memoization.
Comparison
In contrast to greedy algorithms, which address local optimization, dynamic
algorithms aim at an overall optimization of the problem.
In contrast to divide and conquer algorithms, where solutions are combined to
achieve an overall solution, dynamic algorithms use the output of a smaller sub-problem and
then try to optimize a bigger sub-problem. Dynamic algorithms use Memoization to
remember the output of already solved sub-problems.
Example
The following computer problems can be solved using the dynamic programming approach:
• Multistage Graph
• All-pairs shortest path by Floyd-Warshall
• Shortest path by Dijkstra
• Single-source shortest path by Bellman-Ford
Dynamic programming can be used in both top-down and bottom-up manner.
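To make the contrast with the earlier greedy example concrete, here is a sketch of coin counting solved with dynamic programming, in both the bottom-up (tabulation) and top-down (memoization) styles; the function names are illustrative, not from the syllabus:

```python
from functools import lru_cache

def min_coins_bottom_up(coins, amount):
    """Tabulation: best[a] = fewest coins that sum to a (None if impossible)."""
    best = [0] + [None] * amount
    for a in range(1, amount + 1):
        options = [best[a - c] for c in coins if c <= a and best[a - c] is not None]
        best[a] = min(options) + 1 if options else None
    return best[amount]

def min_coins_top_down(coins, amount):
    """Memoization: cache the result of every solved sub-problem."""
    @lru_cache(maxsize=None)
    def solve(a):
        if a == 0:
            return 0
        options = [solve(a - c) for c in coins if c <= a]
        options = [o for o in options if o is not None]
        return min(options) + 1 if options else None
    return solve(amount)

# Where greedy used 6 coins for 15 with coins {1, 7, 10}, DP finds 3 (7 + 7 + 1).
print(min_coins_bottom_up((1, 7, 10), 15))   # 3
print(min_coins_top_down((1, 7, 10), 15))    # 3
```

Both styles solve each sub-amount exactly once; the top-down version simply lets the recursion discover which sub-problems are needed.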
Shortest Paths:
Breadth-first search strategy can be used to find a shortest path from some starting
vertex to every other vertex in a connected graph.

Defining Shortest Paths in a Weighted Graph:

Let G be a weighted graph. The length (or weight) of a path P is the sum of the weights
of the edges of P. That is, if P = ((v0,v1),(v1,v2),...,(vk−1,vk)), then the length of P, denoted
w(P), is defined as

    w(P) = Σ (i = 0 to k−1) w(vi, vi+1)

The distance from a vertex u to a vertex v in G, denoted d(u,v), is the length of a
minimum-length path (also called shortest path) from u to v, if such a path exists. People
often use the convention that d(u,v) = ∞ if there is no path at all from u to v in G.
Even if there is a path from u to v in G, however, if there is a cycle in G whose total
weight is negative, the distance from u to v may not be defined.
There is an interesting approach for solving this single-source problem based on the
greedy method design pattern.
Dijkstra's Algorithm:
The single-source shortest-path problem is solved by performing a "weighted"
breadth-first search starting at the source vertex s, growing a "cloud" of vertices whose
shortest distance from s is known.
In each iteration, the next vertex chosen is the vertex outside the cloud that is closest
to s. The algorithm terminates when no more vertices are outside the cloud.
Applying the greedy method to the single-source shortest-path problem, results in an
algorithm known as Dijkstra’s algorithm.

Edge Relaxation:

Procedure:
• Assign the source node as S and Enqueue S.
• Dequeue the vertex S from queue and assign the value of that vertex to be known
and then find its adjacency vertices.
• If the distance of the adjacent vertices is equal to infinity then change the distance of
that vertex as the distance of its source vertex. Increment by 1 and enqueue the
vertex.
• Repeat step ii until the queue becomes empty.
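The procedure above describes the unweighted case, where every relaxation simply adds 1 to the source vertex's distance. A small sketch of that procedure, using an assumed adjacency-list graph of our own:

```python
from collections import deque

def bfs_shortest_path(adj, s):
    """Unweighted shortest-path lengths from s; adj maps vertex -> neighbours."""
    dist = {v: float('inf') for v in adj}
    dist[s] = 0
    q = deque([s])                       # enqueue the source S
    while q:
        u = q.popleft()                  # dequeue u and mark it as known
        for v in adj[u]:
            if dist[v] == float('inf'):  # adjacent vertex still at infinity
                dist[v] = dist[u] + 1    # its source vertex's distance plus 1
                q.append(v)
    return dist

adj = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(bfs_shortest_path(adj, 'A'))   # {'A': 0, 'B': 1, 'C': 1, 'D': 2}
```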
Algorithm ShortestPath(G, s):
# Input: A weighted graph G with nonnegative edge weights, and a distinguished vertex s of G.
# Output: The length of a shortest path from s to v for each vertex v of G.
# Initialize D[s] = 0 and D[v] = ∞ for each vertex v ≠ s.
# Let a priority queue Q contain all the vertices of G using the D labels as keys.
while Q is not empty do
    u = value returned by Q.remove_min()        # pull a new vertex u into the cloud
    for each vertex v adjacent to u such that v is in Q do
        if D[u] + w(u,v) < D[v] then            # perform the relaxation procedure on edge (u,v)
            D[v] = D[u] + w(u,v)
            Change to D[v] the key of vertex v in Q.
return the label D[v] of each vertex v
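A runnable Python version of this pseudocode can use heapq; since heapq has no key-decrease operation, this sketch pushes a fresh entry on each relaxation and skips stale ones. The edge list below reconstructs the worked example that follows; the costs of the edges leaving v2 are assumed (3 and 10) because the original figure is not reproduced here:

```python
import heapq

def dijkstra(graph, s):
    """graph: dict mapping each vertex to a list of (neighbour, weight) pairs."""
    D = {v: float('inf') for v in graph}
    D[s] = 0
    pq = [(0, s)]                      # priority queue keyed on the D labels
    while pq:
        d, u = heapq.heappop(pq)       # pull a new vertex u into the cloud
        if d > D[u]:
            continue                   # stale entry; u was already finalized
        for v, w in graph[u]:
            if D[u] + w < D[v]:        # relaxation procedure on edge (u, v)
                D[v] = D[u] + w
                heapq.heappush(pq, (D[v], v))
    return D

G = {
    'v1': [('v2', 2), ('v4', 1)],
    'v2': [('v4', 3), ('v5', 10)],     # assumed edge costs for v2
    'v3': [('v1', 4), ('v6', 5)],
    'v4': [('v3', 2), ('v5', 2), ('v6', 8), ('v7', 4)],
    'v5': [('v7', 6)],
    'v6': [],
    'v7': [('v6', 1)],
}
print(dijkstra(G, 'v1'))
# {'v1': 0, 'v2': 2, 'v3': 3, 'v4': 1, 'v5': 3, 'v6': 6, 'v7': 5}
```

The distances returned match the values derived step by step in the worked example below.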
Example: Find the shortest path using Dijkstra's algorithm.

Solution:
1. v1 is taken as source.
2. Now v1 is a known vertex, marked as 1. Its adjacent vertices are v2 and v4; their pv and dv
values are updated:
T[v2].dist = Min(T[v2].dist, T[v1].dist + Cv1,v2) = Min(∞, 0+2) = 2
T[v4].dist = Min(T[v4].dist, T[v1].dist + Cv1,v4) = Min(∞, 0+1) = 1

3. Select the vertex with minimum distance among v2 and v4. v4 is marked as a known vertex.
Its adjacent vertices are v3, v5, v6 and v7.
T[v3].dist = Min(T[v3].dist, T[v4].dist + Cv4,v3) = Min(∞, 1+2) = 3
T[v5].dist = Min(T[v5].dist, T[v4].dist + Cv4,v5) = Min(∞, 1+2) = 3
T[v6].dist = Min(T[v6].dist, T[v4].dist + Cv4,v6) = Min(∞, 1+8) = 9
T[v7].dist = Min(T[v7].dist, T[v4].dist + Cv4,v7) = Min(∞, 1+4) = 5

4. Select the vertex which is at the shortest distance from the source v1. v2 is the smallest one;
v2 is marked as a known vertex. Its adjacent vertices are v4 and v5. The distances from v1 to
v4 and v5 through v2 are greater than the previous dv values, so there is no change in the dv
and pv values.
5. Select the next smallest vertex from the source; v3 and v5 are the smallest ones. The
adjacent vertices of v3 are v1 and v6. v1 is the source, so there is no change in its dv and pv.
T[v6].dist = Min(T[v6].dist, T[v3].dist + Cv3,v6) = Min(9, 3+5) = 8
dv and pv values are updated. The adjacent vertex of v5 is v7; no change in its dv and pv
values.

6. The next smallest vertex is v7. Its adjacent vertex is v6.
T[v6].dist = Min(T[v6].dist, T[v7].dist + Cv7,v6) = Min(8, 5+1) = 6
dv and pv values are updated.
7. The last vertex, v6, is declared known. There are no unknown adjacent vertices for v6, so
there is no update to the table.
The shortest distances identified from the source v1 are:

v1 to v2 is 2
v1 to v4 is 1
v1 to v6 is 6
v1 to v3 is 3
v1 to v5 is 3
v1 to v7 is 5

Algorithm Analysis:
The time complexity of this algorithm is O(|E| + |V|²) = O(|V|²).
Minimum Spanning Trees:
A tree that contains every vertex of a connected graph G is said to be a spanning
tree, and the problem of computing a spanning tree T with smallest total weight is known as
the minimum spanning tree (or MST) problem.
Given an undirected, weighted graph G, the goal is to find a tree T that contains all
the vertices in G and minimizes the total weight

    w(T) = Σ (u,v)∈T w(u,v)
Procedure:
o Begin with some vertex s, defining the initial "cloud" of vertices C.
o Then, in each iteration, choose a minimum-weight edge e = (u,v), connecting
a vertex u in the cloud C to a vertex v outside of C.
o The vertex v is then brought into the cloud C and the process is repeated until
a spanning tree is formed.
Algorithm PrimJarnik(G):
Input: An undirected, weighted, connected graph G with n vertices and m edges
(here G is an adjacency matrix, with 0 meaning "no edge")
Output: A minimum spanning tree T for G

INF = 9999999
V = 5                          # number of vertices
selected = [False] * V         # selected[i] is True once vertex i joins the cloud
no_edge = 0                    # number of MST edges chosen so far
selected[0] = True             # pick any vertex s of G; here vertex 0
print("Edge : Weight\n")
while no_edge < V - 1:
    minimum = INF
    x = 0
    y = 0
    for i in range(V):
        if selected[i]:
            for j in range(V):
                if (not selected[j]) and G[i][j]:
                    # j is not yet selected and there is an edge (i, j)
                    if minimum > G[i][j]:
                        minimum = G[i][j]
                        x = i
                        y = j
    print(str(x) + "-" + str(y) + ":" + str(G[x][y]))
    selected[y] = True
    no_edge += 1
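The loop above can be packaged as a function and tried on a hypothetical 5-vertex adjacency matrix (0 meaning "no edge"); the matrix values below are illustrative only:

```python
def prim_mst(G):
    """Return MST edges of G, an adjacency matrix where 0 means 'no edge'."""
    V = len(G)
    selected = [False] * V
    selected[0] = True                 # start the cloud from vertex 0
    mst = []
    for _ in range(V - 1):
        minimum, x, y = float('inf'), 0, 0
        for i in range(V):             # scan all edges leaving the cloud
            if selected[i]:
                for j in range(V):
                    if not selected[j] and G[i][j] and G[i][j] < minimum:
                        minimum, x, y = G[i][j], i, j
        mst.append((x, y, G[x][y]))    # take the cheapest crossing edge
        selected[y] = True
    return mst

G = [[0,  9, 75,  0,  0],
     [9,  0, 95, 19, 42],
     [75, 95, 0, 51, 66],
     [0, 19, 51,  0, 31],
     [0, 42, 66, 31,  0]]
print(prim_mst(G))   # [(0, 1, 9), (1, 3, 19), (3, 4, 31), (3, 2, 51)]
```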
Example

Analyzing the Prim-Jarnik Algorithm:

o Each priority queue operation runs in O(log n) time.
o The overall time for the algorithm is O((n + m) log n), which is O(m log n) for a
connected graph.
o Alternatively, we can achieve O(n²) running time by using an unsorted list as
a priority queue.
2 Mark Questions with Answers


1. Define Graph.
A graph is a way of representing relationships that exist between pairs of objects.
That is, a graph is a set of objects, called vertices ‘V’, together with a collection of pairwise
connections between them, called edges ‘E’. It can also be represented as G=(V, E).

2. Define adjacent nodes.


Any two nodes which are connected by an edge in a graph are called adjacent
nodes. For example, if an edge x ∈ E is associated with a pair of nodes (u,v), where u, v ∈ V,
then we say that the edge x connects the nodes u and v.

3. What is a directed graph?


A graph in which every edge is directed is called a directed graph.

4. What is an undirected graph?


A graph in which every edge is undirected is called an undirected graph.

5. What is a loop?
An edge of a graph which connects to itself is called a loop or sling.
6. What is a simple graph?


A graph which has not more than one edge between any pair of nodes is called a
simple graph.

7. What is a weighted graph?


A graph in which weights are assigned to every edge is called a weighted graph.

8. Define outdegree of a graph?


In a directed graph, for any node v, the number of edges which have v as their initial
node is called the out degree of the node v.

9. Define indegree of a graph?


In a directed graph, for any node v, the number of edges which have v as their
terminal node is called the indegree of the node v.

10. Define path in a graph?


The path in a graph is the route taken to reach terminal node from a starting node.
11. What is a simple path?


A path in a graph in which the edges are distinct is called a simple path. It is also
called edge-simple.

12. What is a cycle or a circuit?


A path which originates and ends in the same node is called a cycle or circuit.

13. What is an acyclic graph?


A graph which does not have any cycles is called an acyclic graph.

14. What is meant by strongly connected in a graph?


An undirected graph is connected, if there is a path from every vertex to every other
vertex. A directed graph with this property is called strongly connected.

15. When is a graph said to be weakly connected?


When a directed graph is not strongly connected but the underlying graph is
connected, then the graph is said to be weakly connected.

16. Name the different ways of representing a graph?


1. Edge List - maintains an unordered list of all edges, but there is no efficient way
to locate a particular edge (u,v), or the set of all edges incident to a vertex v.
2. Adjacency List - for each vertex, a separate list containing those edges that are
incident to the vertex. The complete set of edges can be determined by taking the
union of the smaller sets.
3. Adjacency Map - is very similar to an adjacency list, but the secondary container
of all edges incident to a vertex is organized as a map.
4. Adjacency Matrix - Each entry is dedicated to storing a reference to the edge
(u,v) for a particular pair of vertices u and v; if no such edge exists, the entry will
be None.
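The four representations can be contrasted on a tiny made-up graph; this sketch builds each of them for an assumed 4-vertex undirected graph:

```python
# A small undirected graph: 4 vertices, edges (0,1), (0,2), (1,2), (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# 1. Edge list: just the unordered collection of edges.
edge_list = list(edges)

# 2. Adjacency list: one list of neighbours per vertex.
adj_list = [[] for _ in range(n)]
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)

# 3. Adjacency map: neighbour -> edge, for O(1) expected edge lookup.
adj_map = [{} for _ in range(n)]
for u, v in edges:
    adj_map[u][v] = (u, v)
    adj_map[v][u] = (u, v)

# 4. Adjacency matrix: entry [u][v] holds the edge, or None if absent.
adj_matrix = [[None] * n for _ in range(n)]
for u, v in edges:
    adj_matrix[u][v] = (u, v)
    adj_matrix[v][u] = (u, v)

print(adj_list)          # [[1, 2], [0, 2], [0, 1, 3], [2]]
print(adj_map[2])        # {0: (0, 2), 1: (1, 2), 3: (2, 3)}
print(adj_matrix[3][0])  # None  (no edge between 3 and 0)
```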

17. What is an undirected acyclic graph?


When every edge in an acyclic graph is undirected, it is called an undirected acyclic
graph. It is also called an undirected forest.

18. What is a minimum spanning tree?


A minimum spanning tree of an undirected graph G is a tree formed from graph
edges that connects all the vertices of G at the lowest total cost.

19. Name two algorithms to find a minimum spanning tree.


• Kruskal's algorithm
• Prim's algorithm

20. Define graph traversals.


Traversing a graph is an efficient way to visit each vertex and edge exactly once.
There are two traversal strategies used in traversing a graph:
a. Breadth first search
b. Depth first search

21. List the two important key points of depth first search.
i) If path exists from one node to another node, walk across the edge – exploring the edge.
ii) If path does not exist from one specific node to any other node, return to the previous
node where we have been before – backtracking.

22. What do you mean by breadth first search (BFS)?


BFS performs simultaneous explorations starting from a common point and
spreading out independently.
23. Differentiate BFS and DFS.


SNo | Concept                       | BFS                                                              | DFS
----|-------------------------------|------------------------------------------------------------------|------------------------------------------------------------------
1   | Stands for                    | BFS stands for Breadth First Search.                             | DFS stands for Depth First Search.
2   | Approach used                 | Works on the concept of FIFO (First In First Out).               | Works on the concept of LIFO (Last In First Out).
3   | Suitable for                  | More suitable for searching vertices closer to the given source. | More suitable when there are solutions away from the source.
4   | Time complexity               | O(V + E) with an adjacency list; O(V^2) with an adjacency matrix.| Also O(V + E) with an adjacency list; O(V^2) with an adjacency matrix.
5   | Visiting of siblings/children | Siblings are visited before the children.                        | Children are visited before the siblings.
6   | Applications                  | Used in applications such as bipartite graphs and shortest paths.| Used in applications such as acyclic graphs and topological ordering.
7   | Memory                        | Requires more memory.                                            | Requires less memory.
8   | Speed                         | Slow as compared to DFS.                                         | Fast as compared to BFS.
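The FIFO/LIFO contrast can be seen in a short sketch; the adjacency list is our own example, and the DFS version uses an explicit stack rather than recursion:

```python
from collections import deque

def bfs(adj, s):
    """FIFO queue: siblings are visited before children."""
    order, seen, q = [], {s}, deque([s])
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                q.append(v)
    return order

def dfs(adj, s):
    """LIFO stack: children are visited before siblings."""
    order, seen, stack = [], set(), [s]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        order.append(u)
        for v in reversed(adj[u]):   # reversed so neighbours pop in listed order
            stack.append(v)
    return order

adj = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(bfs(adj, 1))   # [1, 2, 3, 4]  -- both children of 1 before their children
print(dfs(adj, 1))   # [1, 2, 4, 3]  -- goes deep through 2 before visiting 3
```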

24. What do you mean by tree edge?


If w is undiscovered at the time edge vw is explored, then vw is called a tree edge and v
becomes the parent of w. A tree edge is an edge which is present in the tree obtained after
applying DFS on the graph.

25. What do you mean by back edge?


If w is an ancestor of v at the time edge vw is explored, then vw is called a back edge.
26. Define biconnectivity.


A connected graph G is said to be biconnected, if it remains connected after removal
of any one vertex and the edges that are incident upon that vertex. A connected graph is
biconnected, if it has no articulation points.

27. What do you mean by articulation point?


If a graph is not biconnected, the vertices whose removal would disconnect the graph
are known as articulation points.

28. What do you mean by shortest path?


A path having minimum weight between two vertices is known as shortest path, in
which weight is always a positive number.

29. Define Activity node graph.


Activity node graphs represent a set of activities and scheduling constraints. Each
node represents an activity (task), and an edge represents the next activity.

30. Define adjacency list.


An adjacency list is an array, indexed by vertex number, containing linked lists. For
each vertex Vi, the ith array entry contains a list with information on all edges of G that
leave Vi. It is used to represent graph-related problems.