15Cs201J-Data Structures: Unit-I


15CS201J-DATA STRUCTURES

UNIT-I
INTRODUCTION TO DATA STRUCTURES

DEPARTMENT OF SOFTWARE ENGINEERING


Assignment Method:
Cycle Test I - 10%
Cycle Test II - 15%
Cycle Test III - 15%
Surprise Test - 5%
Assignment - 5%
---------------------------------
Total - 50%



Text Books
1. Seymour Lipschutz, “Data Structures with C”, McGraw Hill Education, Special Indian Edition, 2014.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C”, 2nd Edition, Pearson Education, 2011.



Outline
• Introduction
• Basic terminology
• Data structures: operations
• ADT, Algorithms
• Complexity, time-space trade-off
• Mathematical notations and functions
• Asymptotic notations
• Linear and binary search
• Bubble sort, insertion sort

Elementary Data Organization

• Data are simply values or sets of values.
• Collections of data are frequently organized into a hierarchy of fields, records and files.
• This organization may not be complex enough to maintain and efficiently process certain collections of data.
• For this reason, data are organized into more complex types of structures called data structures.
DATA vs INFORMATION
DATA | INFORMATION
Raw facts about anything | Processed data
Computers need data | Humans need information
Data doesn’t depend on information | Information depends on data
Input to any system may be treated as data | Output after processing the data given to the system is information
Data: raw material | Information: product
Data may not be in order | Information should be in order


Information
• Data + Meaning
• Processed data (Making data meaningful and
useful)

• Subset of Data
• Manipulated raw data



Example
Data | Information
Ingredients | Recipe
01/01, 01012015, or 20150101 (a date as data) | New Year’s Day, the first day of the year
Each student's test score is one piece of data | Performance of a student, average score of a class, performance of a school: information derived from the given data


Processing of data
• Collecting the data
• Organizing it in a structure
• Manipulating the data as needed


Data Structure
 Organizing data in computer memory so that it can be used efficiently (algorithm efficiency)
 A way of collecting and organising data so that operations can be performed on it effectively
 Anything that can store data
 Representing data elements in terms of some relationship, for better organization and storage
 Store ordered data

Data Structure
    Primitive Data Structure: Integer, Float, Character, Boolean
    Non-Primitive/User-Defined Data Structure
        Linear Data Structure: Arrays, Stack, Queues, Linked List
        Non-Linear Data Structure: Trees, Graphs


Linear Data Structure
 The elements organized in the data structure form a sequence
 Data elements are related/arranged in a linear fashion
 Representation of linear data structures in memory (two ways):
1. Elements in consecutive/sequential memory locations: Arrays
2. Elements with a link to another element: Linked List
 Other linear data structures: Stacks, Queues

Non – Linear Data Structure
• Data elements are not arranged in sequence
• Examples: Trees and Graphs


Static Data Structure
 Memory size is fixed
 Size of the Data Structure – Predefined
 Memory Allocated at compile time
Advantage:
No need to check/Keep track of size of the data
structure
Disadvantage:
Memory can’t be utilized efficiently as it is assigned even
after deletion of values



Dynamic Data Structure
 Memory is allocated to the data structure dynamically, as the program executes (at run time)
 Used when the size of the data structure is not known in advance
 Memory size is not fixed; memory is used as needed
Advantage:
Memory is used efficiently; only as much as is needed is assigned.
Disadvantage:
The program must keep track of the memory it uses
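
The contrast between static and dynamic structures can be sketched in C. This is only an illustration under assumptions: the type IntVec and the functions intvec_init, intvec_push and intvec_free are hypothetical names, and doubling the capacity is one common growth policy, not something the slides prescribe.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints (illustrative names only). */
typedef struct {
    int *items;      /* heap block holding the elements */
    size_t size;     /* number of elements in use */
    size_t capacity; /* number of elements the block can hold */
} IntVec;

/* Start empty: no memory is allocated until it is needed. */
void intvec_init(IntVec *v) {
    v->items = NULL;
    v->size = 0;
    v->capacity = 0;
}

/* Append one value, growing the block at run time.
   Returns 0 on success, -1 if the allocation fails. */
int intvec_push(IntVec *v, int value) {
    if (v->size == v->capacity) {
        size_t new_cap = (v->capacity == 0) ? 4 : v->capacity * 2;
        int *p = realloc(v->items, new_cap * sizeof *p);
        if (p == NULL) {
            return -1;           /* the old block is still valid */
        }
        v->items = p;
        v->capacity = new_cap;
    }
    v->items[v->size++] = value;
    return 0;
}

/* The program itself must release what it allocated. */
void intvec_free(IntVec *v) {
    free(v->items);
    intvec_init(v);
}
```

The size and capacity fields are exactly the bookkeeping listed above as the disadvantage: a static array such as int a[100] needs none of it, but its size cannot change.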

Data Structure Operations

1. Traversing: Accessing each record exactly once so that certain items in the record may be processed.
2. Searching: Finding the location of the record with a given key value.
3. Inserting: Adding a new record to the structure.
4. Deleting: Removing a record from the structure.
5. Sorting: Arranging the records in some logical order.
6. Merging: Combining the records in two different sorted files into a single sorted file.


Algorithm

Figure: Problem → Algorithm; Input → “Computer” (executing the algorithm) → Output


Algorithm
An Algorithm is a sequence of unambiguous
instructions for solving a problem,
i.e., for obtaining a required output for any
legitimate input in a finite amount of time.



Properties of Algorithm

1. Finiteness
Terminates after a finite number of steps
2. Definiteness
Each step is rigorously (exactly and accurately) and unambiguously specified
3. Input
Valid inputs are clearly specified
4. Output
The expected output is clearly specified and can be proved correct for any valid input
5. Effectiveness
Steps are sufficiently simple and basic



Types of Algorithm
1) Iterative
A()
{
for i=1 to n
max(a,b)
}
2) Recursive
A(n)
{
if( )
A(n/2)
}
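
The two styles can be compared on one small problem. The sketch below is an assumption chosen for illustration (it is not from the slides): both functions count how many times n can be halved before reaching 1, once with a loop and once with the A(n) → A(n/2) recursion pattern shown above.

```c
/* Iterative version: a loop halves n until it reaches 1. */
int halvings_iterative(int n) {
    int count = 0;
    while (n > 1) {
        n = n / 2;
        count++;
    }
    return count;
}

/* Recursive version: the function calls itself on half the
   problem size, mirroring A(n) { if (...) A(n/2) }. */
int halvings_recursive(int n) {
    if (n <= 1) {
        return 0;               /* base case stops the recursion */
    }
    return 1 + halvings_recursive(n / 2);
}
```

Both versions perform about log2 n steps; the recursive one additionally uses one stack frame per call.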

Algorithmic Efficiency Function


Basic Efficiency Classes
Class | Name | Comments
1 | constant | May be in best cases
lg n | logarithmic | Halving problem size at each iteration
n | linear | Scan a list of size n
n lg n | linearithmic | Divide and conquer algorithms, e.g., mergesort
n^2 | quadratic | Two embedded loops, e.g., selection sort
n^3 | cubic | Three embedded loops, e.g., matrix multiplication
2^n | exponential | All subsets of an n-element set
n! | factorial | All permutations of an n-element set
Measure running time in terms of # of
basic operations
• Basic operation: the operation that
contributes the most to the total running time
of an algorithm
• Usually the most time consuming operation in
the algorithm’s innermost loop
Input size and basic operation examples
Problem | Measure of input size | Basic operation
Search for a key in a list of n items | Number of items in the list, n | Key comparison
Add two n×n matrices | Dimensions of the matrices, n | Addition
Polynomial evaluation | Order of the polynomial | Multiplication
Asymptotic Notations
• Three notations are used to compare the orders of growth of an algorithm’s basic operation count:
– O(g(n)): set of functions that grow no faster than g(n) (asymptotic upper bound; most often quoted for the worst case)
– Ω(g(n)): set of functions that grow at least as fast as g(n) (asymptotic lower bound)
– Θ(g(n)): set of functions that grow at the same rate as g(n) (asymptotically tight bound)
• Each notation bounds a function; any of the three can be applied to a best-case, worst-case, or average-case count.


O-Notation (contd.)
• Definition: A function t(n) is said to be in O(g(n)), denoted t(n) ∈ O(g(n)), if t(n) is bounded above by some positive constant multiple of g(n) for all sufficiently large n.
• That is, if we can find positive constants c and n0 such that:
t(n) ≤ c × g(n) for all n ≥ n0


O(big oh)-Notation
Figure: for n ≥ n0 the curve t(n) stays below c × g(n); the behaviour for n < n0 doesn’t matter. t(n) ∈ O(g(n))

Ω-Notation (contd.)
• Definition: A function t(n) is said to be in Ω(g(n)), denoted t(n) ∈ Ω(g(n)), if t(n) is bounded below by some positive constant multiple of g(n) for all sufficiently large n.
• That is, if we can find positive constants c and n0 such that:
t(n) ≥ c × g(n) for all n ≥ n0


Ω(big omega)-Notation
Figure: for n ≥ n0 the curve t(n) stays above c × g(n); the behaviour for n < n0 doesn’t matter. t(n) ∈ Ω(g(n))


Θ-Notation (contd.)
• Definition: A function t(n) is said to be in Θ(g(n)), denoted t(n) ∈ Θ(g(n)), if t(n) is bounded both above and below by some positive constant multiples of g(n) for all sufficiently large n.
• That is, if we can find positive constants c1, c2, and n0 such that:
c2×g(n) ≤ t(n) ≤ c1×g(n) for all n ≥ n0
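
As a worked illustration of this definition, take g(n) = n and the function t(n) = 3n + 2. This particular t(n) is an assumption chosen only as an example; it does not come from the slides.

```latex
\[
3n \;\le\; 3n + 2 \;\le\; 4n \qquad \text{for all } n \ge 2,
\]
\[
\text{so with } c_2 = 3,\quad c_1 = 4,\quad n_0 = 2:\qquad 3n + 2 \in \Theta(n).
\]
```

The right-hand inequality alone gives 3n + 2 ∈ O(n), and the left-hand one alone gives 3n + 2 ∈ Ω(n).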



Θ(big theta)-Notation
Figure: for n ≥ n0 the curve t(n) stays between c2 × g(n) and c1 × g(n); the behaviour for n < n0 doesn’t matter. t(n) ∈ Θ(g(n))


Complexity of Algorithms

• The complexity of an algorithm M is the function f(n) which gives the running time and/or storage space requirement of the algorithm in terms of the size n of the input data.
• Two types of complexity:
1. Time Complexity
2. Space Complexity


Performance Analysis of Algorithm

An investigation of an algorithm’s efficiency with respect to two resources:
 Space Complexity
Amount of memory an algorithm needs to perform the computation
 Time Complexity
Amount of CPU time an algorithm needs to run to completion.


Space Complexity
S(P) = c + Sp(instance)

The memory space S(P) needed by a program P consists of two components:

– A fixed part, c: needed for instruction space (space for the code), simple variable space, constants space, etc. (c is ignored during calculation)
– A variable part, Sp(instance): dependent on a particular instance of input and output data

 For a constant number of variables (constant space complexity): O(1)
 For an array of n elements (linear space complexity): O(n)


Example-1
1. Algorithm abc (a, b, c)
2. {
3. return a+b+b*c+(a+b-c)/(a+b)+4.0;
4. }
Space required to store variables: a, b, and c.
Sp()= 3. S(P) = 3.



Example 2
Algorithm Sum(a[], n)
{
    s := 0.0;
    for i = 1 to n do
        s := s + a[i];
    return s;
}
Space needed: n takes 1 word, a[] takes n words (for n elements), i takes 1 word, s takes 1 word.
Sp(n) = (n + 3).
Hence S(P) = (n + 3).


Time Complexity

• T(P) = c + tp(instance)
• The time T(P) required to run a program P also consists of two components:
– A fixed part, c: compile time, which is independent of the problem instance (c is ignored during calculation)
– A variable part, tp(instance): run time, which depends on the problem instance


Theoretical Analysis of Time Efficiency
• Count the number of times the algorithm’s basic operation is executed on inputs of size n: C(n)
T(n) ≈ cop × C(n)
where n is the input size, T(n) is the running time, cop is the execution time for the basic operation, and C(n) is the number of times the basic operation is executed.
• Ignore cop and focus on orders of growth.


Example-1

Algorithm Sum(a[], n)        Steps
{
    s = 0.0;                 1
    for i = 1 to n do        n
        s = s + a[i];        n
    return s;                1
}
Total: 2n + 2 = O(n)
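
The step tally above can be checked by instrumenting the loop. This is a minimal sketch: the function name sum_with_steps and the steps output parameter are hypothetical, and the counter follows the same convention as the tally (1 for the initialisation, n for the loop line, n for the additions, 1 for the return).

```c
/* Sums a[0..n-1] and reports the step count through *steps. */
double sum_with_steps(const double a[], int n, long *steps) {
    double s = 0.0;
    *steps = 1;                  /* s = 0.0 */
    for (int i = 0; i < n; i++) {
        *steps += 1;             /* loop line, counted n times */
        s += a[i];
        *steps += 1;             /* s = s + a[i], counted n times */
    }
    *steps += 1;                 /* return s */
    return s;
}
```

For any n the counter ends at 2n + 2, so doubling the array length roughly doubles the work, which is what O(n) expresses.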

Example-2

Algorithm Sum(a[][], n, m)   Steps
{
    for i = 1 to n do        n
        for j = 1 to m do    nm
            s = s + a[i][j]; nm
    return s;                1
}
Total: 2nm + n + 1 = O(nm)

Iterative - Example
1) A()
{
    int i;
    for (i = 1 to n)
        printf("abc");
}
Soln: O(n)


Iterative - Example
2) A()
{
    int i, j;
    for (i = 1 to n)
        for (j = 1 to n)
            printf("abc");
}
Soln: n × n = n^2, so O(n^2)

Iterative - Example
3) A()
{
    for (i = 1; i*i <= n; i++)
        print(" ");
}
Soln: the loop stops once i^2 > n, i.e. when i ≈ √n, so it runs about √n times: O(√n)
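
The square-root bound can be observed directly by counting iterations. In this sketch the function name sqrt_loop_iterations is hypothetical, and the condition is written i * i <= n because in C the ^ operator is bitwise XOR, not exponentiation.

```c
/* Counts how many times the body of
   for (i = 1; i*i <= n; i++) executes. */
long sqrt_loop_iterations(long n) {
    long count = 0;
    for (long i = 1; i * i <= n; i++) {
        count++;
    }
    return count;
}
```

The count equals the integer part of √n, so increasing n a hundredfold only increases the number of iterations tenfold.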



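Iterative - Example
4) A()
{
    int i, j, k, n;
    for (i = 1; i <= n; i++)
    {
        for (j = 1; j <= i*i; j++)
        {
            for (k = 1; k <= n/2; k++)
            {
                printf(" ");
            }
        }
    }
}
Soln:
Outer loop: n iterations
Middle loop: up to n^2 iterations each, so at most n × n^2 = n^3
Inner loop: n/2 iterations each, so at most n/2 × n^3
= O(n^4)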


Example
ALGORITHM MaxElement(A[0..n-1])
//Determines largest element
maxval <- A[0]
for i <- 1 to n-1 do
    if A[i] > maxval
        maxval <- A[i]
return maxval

Input size: n
Basic operation: > or <-
C(n) = n - 1 ∈ Θ(n) (one comparison for each i from 1 to n-1)
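
A C rendering of MaxElement with a comparison counter makes the count C(n) visible. This is a sketch under assumptions: the name max_element and the comparisons output parameter are hypothetical, and the array is assumed to hold at least one element.

```c
/* Returns the largest element of a[0..n-1] (n >= 1) and stores
   the number of key comparisons in *comparisons. */
int max_element(const int a[], int n, long *comparisons) {
    int maxval = a[0];
    *comparisons = 0;
    for (int i = 1; i < n; i++) {
        (*comparisons)++;        /* the basic operation: a[i] > maxval */
        if (a[i] > maxval) {
            maxval = a[i];
        }
    }
    return maxval;
}
```

The counter always ends at n - 1 whatever the order of the elements, so best, worst and average cases coincide for this algorithm.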

Example
ALGORITHM UniqueElements(A[0..n-1])
//Determines whether all elements are distinct
for i <- 0 to n-2 do
    for j <- i+1 to n-1 do
        if A[i] = A[j]
            return false
return true

Input size: n
Basic operation: A[i] = A[j]
Does C(n) depend on type of input? Yes: the loops exit early as soon as two equal elements are found.

UniqueElements (contd.)
Cworst(n) = (n-1) + (n-2) + ... + 1 = n(n-1)/2 ∈ Θ(n^2)
Why is Cworst(n) ∈ Θ(n^2) better than saying Cworst(n) ∈ O(n^2)?
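
The dependence on the input can be seen by counting comparisons in a C version of UniqueElements. The function name unique_elements and the comparisons output parameter are hypothetical names used only for this sketch.

```c
#include <stdbool.h>

/* Returns true if all elements of a[0..n-1] are distinct and stores
   the number of element comparisons in *comparisons. */
bool unique_elements(const int a[], int n, long *comparisons) {
    *comparisons = 0;
    for (int i = 0; i <= n - 2; i++) {
        for (int j = i + 1; j <= n - 1; j++) {
            (*comparisons)++;    /* the basic operation: a[i] == a[j] */
            if (a[i] == a[j]) {
                return false;    /* early exit: a duplicate was found */
            }
        }
    }
    return true;
}
```

With five distinct elements the counter reaches 5 × 4 / 2 = 10, the worst case; if the first two elements are equal it stops after a single comparison.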
General Plan for Recursive Algorithms
• Decide on input size parameter
• Identify the basic operation
• Does C(n) also depend on the type of input?
• Set up a recurrence relation
• Solve the recurrence or, at least establish
the order of growth of its solution
Solving Recurrence Relations

• To solve a recurrence relation T(n) we need to derive a form of T(n) that is not a recurrence relation. Such a form is called a closed form of the recurrence relation.
• There are five methods to solve recurrence relations that
represent the running time of recursive methods:
 Iteration method (unrolling and summing)
 Substitution method (Guess the solution and verify by induction)
 Recursion tree method
 Master theorem (Master method)
 Using Generating functions or Characteristic equations
• In this course, we will use the Iteration method
Iteration method
• Steps:
 Expand the recurrence
 Express the expansion as a summation by plugging the recurrence back into itself until you see a pattern.
 Evaluate the summation
• In evaluating the summation one or more of the following summation formulae may be used:
 Arithmetic series: 1 + 2 + ... + n = n(n+1)/2
 Geometric series: 1 + x + x^2 + ... + x^n = (x^(n+1) - 1)/(x - 1), for x ≠ 1
 Special case of the geometric series: 1 + 2 + 2^2 + ... + 2^n = 2^(n+1) - 1
 Harmonic series: 1 + 1/2 + 1/3 + ... + 1/n ≈ ln n

 Others:
Example-1
Factorial
ALGORITHM F(n)
// Output: n!
if n = 0
    return 1
else
    return F(n-1) × n

Input size: n
Basic operation: ×
M(0) = 0
M(n) = M(n-1) + 1 for n > 0, where M(n-1) multiplications compute F(n-1) and 1 more multiplies n by F(n-1)
Analysis Of Recursive Factorial method
 Example 1: Form and solve the recurrence relation for the running time of the factorial method and hence determine its big-O complexity:
long factorial (int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial (n - 1);
}
T(0) = c                      (1)
T(n) = b + T(n - 1)           (2)
     = b + b + T(n - 2)       by substituting T(n - 1) in (2)
     = b + b + b + T(n - 3)   by substituting T(n - 2) in (2)
     ...
     = kb + T(n - k)
The base case is reached when n - k = 0, i.e. k = n; we then have:
T(n) = nb + T(n - n)
     = bn + T(0)
     = bn + c
Therefore the method factorial is O(n)
Analysis Of Recursive Binary Search

public int binarySearch (int target, int[] array, int low, int high) {
    if (low > high)
        return -1;
    else {
        int middle = (low + high)/2;
        if (array[middle] == target)
            return middle;
        else if (array[middle] < target)
            return binarySearch(target, array, middle + 1, high);
        else
            return binarySearch(target, array, low, middle - 1);
    }
}

• The recurrence relation for the running time of the method is:
T(1) = a             if n = 1 (one element array)
T(n) = T(n / 2) + b  if n > 1
Analysis Of Recursive Binary Search (Cont’d)
Without loss of generality, assume n, the problem size, is a power of 2, i.e., n = 2^k
Expanding:
T(1) = a                                          (1)
T(n) = T(n / 2) + b                               (2)
     = [T(n / 2^2) + b] + b = T(n / 2^2) + 2b     by substituting T(n/2) in (2)
     = [T(n / 2^3) + b] + 2b = T(n / 2^3) + 3b    by substituting T(n/2^2) in (2)
     = ...
     = T(n / 2^k) + kb

The base case is reached when n / 2^k = 1, i.e. n = 2^k, so k = log2 n; we then have:

T(n) = T(1) + b log2 n
     = a + b log2 n

Therefore, Recursive Binary Search is O(log n)
Analysis Of Recursive Fibonacci
long fibonacci (int n) { // Recursively calculates Fibonacci number
    if (n == 1 || n == 2)
        return 1;
    else
        return fibonacci(n - 1) + fibonacci(n - 2);
}

T(n) = c                          if n = 1 or n = 2   (1)
T(n) = T(n - 1) + T(n - 2) + b    if n > 2            (2)

We determine a lower bound on T(n):

Expanding: T(n) = T(n - 1) + T(n - 2) + b
≥ T(n - 2) + T(n - 2) + b
= 2T(n - 2) + b
= 2[T(n - 3) + T(n - 4) + b] + b    by substituting T(n - 2) in (2)
≥ 2[T(n - 4) + T(n - 4) + b] + b
= 2^2 T(n - 4) + 2b + b
= 2^2 [T(n - 5) + T(n - 6) + b] + 2b + b    by substituting T(n - 4) in (2)
≥ 2^3 T(n - 6) + (2^2 + 2^1 + 2^0)b
...
≥ 2^k T(n - 2k) + (2^(k-1) + 2^(k-2) + . . . + 2^1 + 2^0)b
= 2^k T(n - 2k) + (2^k - 1)b
The base case is reached when n - 2k = 2, i.e. k = (n - 2) / 2
Hence T(n) ≥ 2^((n - 2)/2) T(2) + [2^((n - 2)/2) - 1]b
= (b + c) 2^((n - 2)/2) - b
= [(b + c) / 2] × 2^(n/2) - b, so Recursive Fibonacci is exponential
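
One rough way to see this growth is to count the calls the recursive function makes. The sketch below is an assumption for illustration: fib_counted and its calls parameter are hypothetical names, and n is assumed to be at least 1.

```c
/* Recursive Fibonacci (n >= 1) that also counts every call made,
   including the initial one, in *calls. */
long fib_counted(int n, long *calls) {
    (*calls)++;
    if (n == 1 || n == 2) {
        return 1;
    }
    return fib_counted(n - 1, calls) + fib_counted(n - 2, calls);
}
```

The number of calls works out to 2 × fibonacci(n) - 1, so it grows as fast as the Fibonacci numbers themselves: computing the 20th number already takes more than thirteen thousand calls.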
Algorithmic Notation

Example:
Write an algorithm for finding the location of the largest element of an array Data.

Largest-Item (Data, N, Loc)


1. [Initialize] Set k:=1, Loc:=1 and Max:=Data[1]
2. [Increment Counter] Set k:=k+1
3. [Test Counter] If k > N , then :
Write: Loc, Max and Exit
4. [Compare and Update] If Max < Data[k] ,then :
Set Loc:=k and Max:=Data[k]
5. [Repeat loop] Go to Step 2.



Analysis of Algorithm Properties
• Measuring an input’s size
• Measuring Running Time
• Orders of Growth (of algorithm’s efficiency
function)-based on the variation of inputs
– Best-case – first found
– Average-case – mid find
– Worst-case – last found or not found



Worst, Best, Average Cases
• Efficiency depends on input size n
• For some algorithms, efficiency depends on
the type of input
• Example: Linear Search
– Given a list of n elements and a search key k,
find if k is in the list
– Scan list, compare elements with k until either
found a match (success), or list is exhausted
(failure)



Linear Search Algorithm
ALGORITHM LinearSearch(A[0..n-1], k)
//Input: A[0..n-1] and k
//Output: Index of first match or -1 if no match is found
i <- 0
while i < n and A[i] ≠ k do
i <- i+1
if i < n
return i //A[i] = k
else
return -1
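
The pseudocode translates almost line for line into C. This is a minimal sketch; the function name linear_search is an assumption, and int keys are used only for simplicity.

```c
/* Returns the index of the first element of a[0..n-1] equal to k,
   or -1 if no element matches. */
int linear_search(const int a[], int n, int k) {
    int i = 0;
    while (i < n && a[i] != k) {
        i++;                     /* scan until a match or the end */
    }
    return (i < n) ? i : -1;
}
```

The test i < n comes first in the loop condition so that a[i] is never read past the end of the array.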



Worst Case Time for Linear Search

• For an array of n elements, the worst case for serial (linear) search requires n array accesses: O(n).
• The worst case occurs when we must loop over all n records:
– the desired record appears in the last position of the array
– the desired record does not appear in the array at all


Average Case for Linear Search

Assumptions:
1. All keys are equally likely in a search
2. We always search for a key that is in the array
Example:
• We have an array of 10 records.
• Searching for the first record requires 1 array access; for the second, 2 array accesses; and so on.
The average over all these searches is:
(1+2+3+4+5+6+7+8+9+10)/10 = 5.5


Average Case Time for Linear Search
Generalize for array size n.

Expression for average-case running time:

= (1+2+…+n)/n
= [n(n+1)/2]/n
= (n+1)/2

Therefore, the average-case time complexity for serial search is O(n).
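
The (n+1)/2 figure can be reproduced by searching once for every key and averaging the comparison counts. Both function names below are hypothetical, and the sketch relies on the same assumptions as the analysis: distinct keys, each equally likely, and every search successful.

```c
/* Comparisons a linear scan makes before it finds k
   (or n comparisons if k is absent). */
long linear_search_comparisons(const int a[], int n, int k) {
    long count = 0;
    for (int i = 0; i < n; i++) {
        count++;
        if (a[i] == k) {
            break;
        }
    }
    return count;
}

/* Average comparison count over one search for each element of a[]. */
double average_comparisons(const int a[], int n) {
    long total = 0;
    for (int i = 0; i < n; i++) {
        total += linear_search_comparisons(a, n, a[i]);
    }
    return (double)total / n;
}
```

For an array of 10 distinct keys this gives 55 / 10 = 5.5, matching the hand calculation for 10 records.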



Binary Search
• Requires a sorted array or a binary search tree.
• Cuts the “search space” in half each time.
• Keeps cutting the search space in half until the target is found or all possible locations have been exhausted.


Binary Search Algorithm
look at “middle” element

if no match then
    look left (if the target is smaller)
    or right (if it is greater)


The Binary Search Algorithm

• Return found or not found (true or false), so it should be a function.
• When moving left or right, change the array boundaries
– We’ll need a first and a last


The Binary Search Algorithm
calculate middle position

if (first and last have “crossed”) then
    “Item not found”
elseif (element at middle = to_find) then
    “Item Found”
elseif to_find < element at middle then
    Look to the left
else
    Look to the right

Looking Left
• Use indices “first” and “last” to keep track of where we are looking
• Move left by setting last = middle - 1

7 12 42 59 71 86 104 212
Figure: F (first) stays on 7, M (middle) is on 59, and L (last) moves from 212 to 42

Looking Right
• Use indices “first” and “last” to keep track of where we are looking
• Move right by setting first = middle + 1

7 12 42 59 71 86 104 212
Figure: M (middle) is on 59, F (first) moves from 7 to 71, and L (last) stays on 212


Binary Search Example – Found

7 12 42 59 71 86 104 212

Looking for 42
Step 1: F on 7, L on 212, M on 59; 42 < 59, so look left
Step 2: F on 7, L on 42, M on 12; 42 > 12, so look right
Step 3: F, M and L all on 42
42 found – in 3 comparisons

Binary Search Example – Not Found

7 12 42 59 71 86 104 212

Looking for 89
Step 1: F on 7, L on 212, M on 59; 89 > 59, so look right
Step 2: F on 71, L on 212, M on 86; 89 > 86, so look right
Step 3: F on 104, L on 212, M on 104; 89 < 104, so look left
Now L (on 86) is to the left of F (on 104): first and last have crossed
89 not found – 3 comparisons


Binary Search Function
Function Find return boolean (A Array, first, last, to_find)
    middle <- (first + last) div 2

    if (first > last) then
        return false
    elseif (A[middle] = to_find) then
        return true
    elseif (to_find < A[middle]) then
        return Find(A, first, middle-1, to_find)
    else
        return Find(A, middle+1, last, to_find)
endfunction
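
The same search can be sketched in C. Note that this version differs from the recursive Find above in two ways, both of which are choices made here rather than taken from the slides: it is iterative, and it returns the index of the match (or -1) instead of a boolean. The name binary_search is hypothetical, and the array is assumed to be sorted in ascending order.

```c
/* Searches the sorted array a[0..n-1] for to_find.
   Returns its index, or -1 if it is not present. */
int binary_search(const int a[], int n, int to_find) {
    int first = 0;
    int last = n - 1;
    while (first <= last) {                 /* not yet "crossed" */
        /* written this way to avoid overflow in first + last */
        int middle = first + (last - first) / 2;
        if (a[middle] == to_find) {
            return middle;
        }
        if (to_find < a[middle]) {
            last = middle - 1;              /* look to the left */
        } else {
            first = middle + 1;             /* look to the right */
        }
    }
    return -1;
}
```

On the example array 7 12 42 59 71 86 104 212, searching for 42 returns index 2 (the third element) and searching for 89 returns -1.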



Binary Search Analysis: Best Case

Best Case: 1 comparison, when the target matches at the first comparison (the middle element)

1 7 9 12 33 42 59 76 81 84 91 92 93 99

Target: 59

Binary Search Analysis: Worst Case

How many comparisons?

Worst Case: keep dividing until one item is left, or there is no match.

• With each comparison we throw away ½ of the list

Items remaining    Comparisons
N                  1
N/2                1
N/4                1
N/8                1
...
1                  1

Number of steps is at most about log2 N

Analysis of Linear and Binary Search

• Binary search reduces the work by half at each comparison
• If the array is not sorted: Linear Search
– Best Case O(1)
– Worst Case O(N)
• If the array is sorted: Binary Search
– Best Case O(1)
– Worst Case O(log2 N)


Insertion Sort



Definition
• Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time. It is much less efficient on large lists than more advanced algorithms such as quick sort, heap sort, or merge sort.
• In practice it is usually more efficient than selection sort and bubble sort


Two Steps
• Selection
• Compare – Shift - Insert



Example of insertion sort
8 2 4 9 3 6    initial array
2 8 4 9 3 6    2 inserted before 8
2 4 8 9 3 6    4 inserted between 2 and 8
2 4 8 9 3 6    9 is already in place
2 3 4 8 9 6    3 inserted between 2 and 4 (4, 8 and 9 shift right)
2 3 4 6 8 9    6 inserted between 4 and 8 (8 and 9 shift right), done


An array of Elements - Unordered
• Left part: sorted
• Right part: unsorted
Steps
1. Key selection: the first key is always the second element of the array
2. Compare the key with the previous element
3. If the key is smaller than the previous element, shift that element right and keep comparing until the key is inserted in its correct position (while loop)
4. Otherwise, take the element after the previous key as the new key
5. Repeat steps 1 to 4 until the last element, A[n-1], has been taken as the key and inserted in its correct position (for loop)


Algorithm
InsertionSort(A[0…n-1])
//Sorts a given array by Insertion Sort
//Input: An array A[0…n-1] of n elements
//Output: Array A[0…n-1] sorted in nondecreasing order
for j ← 1 to n-1
    do key ← A[j]                      //Selection
       i ← j-1
       while i ≥ 0 and A[i] > key      //Comparison
           do A[i + 1] ← A[i]          //Shift
              i ← i-1
       A[i + 1] ← key                  //Insert
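
The pseudocode maps directly onto C. This is a minimal sketch for int arrays; the function name insertion_sort is an assumption, and the comments mark the same four phases as the annotations above.

```c
/* Sorts a[0..n-1] into nondecreasing order by insertion sort. */
void insertion_sort(int a[], int n) {
    for (int j = 1; j < n; j++) {
        int key = a[j];                 /* selection */
        int i = j - 1;
        while (i >= 0 && a[i] > key) {  /* comparison */
            a[i + 1] = a[i];            /* shift one place right */
            i--;
        }
        a[i + 1] = key;                 /* insert */
    }
}
```

Applied to 8 2 4 9 3 6 it produces 2 3 4 6 8 9, the same result as the worked example.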



Complexity
• Space/Memory
• Time
– Count a particular operation
– Count number of steps
– Asymptotic complexity



Best Case Analysis
Elements: 1 2 3 4 5 6 (already sorted)
j=1, key=a[1]=2
i=0, while 0>=0 and a[0]>key
1>2 is false, hence go back to the for loop
j=2, key=a[2]=3
i=1, while 1>=0 and a[1]>key
2>3 is false, hence go back to the for loop
Hence the comparisons are
j=1, i=0  1>2
j=2, i=1  2>3
j=3, i=2  3>4
j=4, i=3  4>5
j=5, i=4  5>6

The while condition is tested (n-1) times and is false every time.
Hence only (n-1) comparisons are made.
So the order is O(n-1)=O(n)

Worst-Case Analysis
Elements: 6 5 4 3 2 1 (sorted in descending order)
In 1st pass = 1 comparison, 1 shift
In 2nd pass = 2 comparisons, 2 shifts
In 3rd pass = 3 comparisons, 3 shifts
……
We have (n-1) passes = 5 passes for n = 6, i.e. 1 + 2 + 3 + 4 + 5
= (n-5)+(n-4)+(n-3)+(n-2)+(n-1)

total compares = 1 + 2 + 3 + … + (n-1)
= (n-1)n/2
= (n^2 - n)/2
= O(n^2)
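
Both counts can be checked by instrumenting the sort. In this sketch the function name insertion_sort_comparisons is hypothetical; it counts only the key comparisons a[i] > key, which is the operation tallied in the analysis.

```c
/* Insertion sort on a[0..n-1] that returns the number of key
   comparisons (tests of a[i] > key) it performed. */
long insertion_sort_comparisons(int a[], int n) {
    long comparisons = 0;
    for (int j = 1; j < n; j++) {
        int key = a[j];
        int i = j - 1;
        while (i >= 0) {
            comparisons++;          /* one test of a[i] > key */
            if (a[i] <= key) {
                break;              /* key is in position */
            }
            a[i + 1] = a[i];        /* shift */
            i--;
        }
        a[i + 1] = key;
    }
    return comparisons;
}
```

For six elements it reports 5 comparisons on sorted input (n - 1) and 15 on descending input (n(n-1)/2).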

Advantages and disadvantages of Insertion sort
Advantages:
• It is efficient for small sets of data
• It is simple to implement
• Its outer loop passes through the array only once
• It is adaptive: efficient for data sets that are already (or nearly) sorted
Disadvantages:
• It is less efficient on larger lists and arrays.

The Bubble Sort
• Bubble sort, sometimes referred to as
sinking sort, is a simple sorting algorithm that
repeatedly steps through the list to be sorted,
compares each pair of adjacent items and
swaps them if they are in the wrong order.



The Bubble Sort
• Each scan will push the maximum element to the top
• 3 5 6 2 8 9 1
• 3 5 6 2 8 9 1
• 3 5 2 6 8 9 1
• 3 5 2 6 8 9 1
• 3 5 2 6 8 9 1
• 3 5 2 6 8 1 9


The Bubble Sort
• Starts at one end of the array and makes repeated scans through the list, comparing successive pairs of elements
• 5 3 6 2 8 9 1
• If the first element of a pair is larger than the second (called an inversion), the values are swapped
• 3 5 6 2 8 9 1


The Bubble Sort
• This is the “bubbling” effect that gives
the bubble sort its name
• This process is continued until the list is
sorted
• The more inversions in the list, the longer
it takes to sort



"Bubbling Up" the Largest Element
• Traverse a collection of elements
– Move from the front to the end
– “Bubble” the largest value to the end using pair-wise comparisons and swapping

Position:                   1   2   3   4   5   6
Start:                      77  42  35  12  101 5
Swap 77 and 42:             42  77  35  12  101 5
Swap 77 and 35:             42  35  77  12  101 5
Swap 77 and 12:             42  35  12  77  101 5
77 < 101, no need to swap:  42  35  12  77  101 5
Swap 101 and 5:             42  35  12  77  5   101

Largest value correctly placed


The “Bubble Up” Algorithm
index <- 1
last_compare_at <- n – 1

loop
exitif(index > last_compare_at)
if(A[index] > A[index + 1]) then
Swap(A[index], A[index + 1])
endif
index <- index + 1
endloop
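
A single “bubble up” pass can be sketched in C as follows. The function name bubble_up is an assumption, and the sketch uses 0-based indices where the pseudocode above is 1-based.

```c
/* One pass over a[0..n-1]: compares each adjacent pair and swaps it
   if out of order, which moves the largest value to a[n-1]. */
void bubble_up(int a[], int n) {
    for (int index = 0; index < n - 1; index++) {
        if (a[index] > a[index + 1]) {
            int tmp = a[index];
            a[index] = a[index + 1];
            a[index + 1] = tmp;
        }
    }
}
```

Starting from 77 42 35 12 101 5, one pass leaves 42 35 12 77 5 101: only the largest value is guaranteed to be in its final place.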

Items of Interest
• Notice that only the largest value is correctly placed
• All other values are still out of order
• So we need to repeat this process

1 2 3 4 5 6
42 35 12 77 5 101

Largest value correctly placed


Repeat “Bubble Up” How Many Times?
• If we have N elements…
• And if each time we bubble an element, we place it in its correct location…
• Then we repeat the “bubble up” process N - 1 times.
• This guarantees we’ll correctly place all N elements.


“Bubbling” All the Elements
The “bubble up” pass is repeated N-1 times:

Position:      1   2   3   4   5   6
After pass 1:  42  35  12  77  5   101
After pass 2:  35  12  42  5   77  101
After pass 3:  12  35  5   42  77  101
After pass 4:  12  5   35  42  77  101
After pass 5:  5   12  35  42  77  101


Reducing the Number of Comparisons
Position:      1   2   3   4   5   6
Start:         77  42  35  12  101 5
After pass 1:  42  35  12  77  5   101
After pass 2:  35  12  42  5   77  101
After pass 3:  12  35  5   42  77  101
After pass 4:  12  5   35  42  77  101
After each pass, one more value at the right-hand end is in its final position.


Reducing the Number of Comparisons
• On the Nth “bubble up”, we only need to
do MAX-N comparisons.
• For example:
– This is the 4th “bubble up”
– MAX is 6
– Thus we have 2 comparisons to do
1 2 3 4 5 6
12 35 5 42 77 101



procedure Bubblesort(A isoftype in/out Arr_Type)
    to_do, index isoftype Num
    to_do <- N - 1

    loop                                        // Outer loop
        exitif(to_do = 0)
        index <- 1
        loop                                    // Inner loop
            exitif(index > to_do)
            if(A[index] > A[index + 1]) then
                Swap(A[index], A[index + 1])
            endif
            index <- index + 1
        endloop
        to_do <- to_do - 1
    endloop
endprocedure // Bubblesort
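
A C sketch of the same procedure is given below. The function name bubble_sort is an assumption, indices are 0-based, and the variable to_do plays the same role as in the pseudocode: the number of comparisons still needed in the current pass.

```c
/* Sorts a[0..n-1] into ascending order by repeated bubble-up passes. */
void bubble_sort(int a[], int n) {
    for (int to_do = n - 1; to_do > 0; to_do--) {       /* outer loop */
        for (int index = 0; index < to_do; index++) {   /* inner loop */
            if (a[index] > a[index + 1]) {
                int tmp = a[index];
                a[index] = a[index + 1];
                a[index + 1] = tmp;
            }
        }
    }
}
```

Because to_do shrinks by one after every pass, the inner loop never revisits the values already placed at the end of the array.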



Best Case of the Bubble Sort

• Let X be the number of interchanges (a discrete quantity).
• Consider a list of n elements that is already sorted:
x1 < x2 < x3 < … < xn


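Best Case (cont.)
• If the sort stops as soon as a pass makes no swaps, it only has to run through the list once since it is already in order, giving n-1 comparisons and X = 0.
• Therefore, the time complexity of the best case scenario is O(n). Without that early-exit check, the Bubblesort procedure still makes n(n-1)/2 comparisons on sorted input.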
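
The O(n) best case depends on stopping after a pass that makes no swaps. The sketch below adds such a check; the function name bubble_sort_early_exit and the swapped flag are assumptions introduced here, not part of the Bubblesort procedure in the slides.

```c
#include <stdbool.h>

/* Bubble sort on a[0..n-1] that stops after a pass with no swaps.
   Returns the number of comparisons performed. */
long bubble_sort_early_exit(int a[], int n) {
    long comparisons = 0;
    for (int to_do = n - 1; to_do > 0; to_do--) {
        bool swapped = false;
        for (int index = 0; index < to_do; index++) {
            comparisons++;
            if (a[index] > a[index + 1]) {
                int tmp = a[index];
                a[index] = a[index + 1];
                a[index + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped) {
            break;      /* no interchanges: the list is already sorted */
        }
    }
    return comparisons;
}
```

On six sorted elements it makes 5 comparisons (n - 1) and stops; on six elements in descending order it makes all 15 (n(n-1)/2).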



Worst Case of the Bubble Sort

• Let X be the number of interchanges (a discrete quantity).
• Consider a list of n elements in descending order:
x1 > x2 > x3 > … > xn.


Worst Case (cont.)

• In each pass, the number of comparisons equals the number of interchanges.
• X = (n-1) + (n-2) + (n-3) + … + 2 + 1
• This is an arithmetic series.


Worst Case (cont.)

• (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2
• So X = n(n-1)/2.
• Therefore, the time complexity of the worst case scenario is O(n^2).


Advantages and disadvantages of Bubble sort
Advantages:
• Popular and easy to implement
• Elements are swapped in place, so only a constant amount of extra storage is needed and the space requirement is minimal
Disadvantages:
• Does not deal well with a list containing a huge number of items

Mock Questions
• Analyze the algorithm that finds the largest element in an array.
• Analyze the algorithm that checks whether the elements of an array are unique.
• Analyze the computation of the factorial of n.
• Analyze selection sort.
• Analyze merge sort.
• Analyze quick sort.
• Analyze shell sort.