15Cs201J-Data Structures: Unit-I

15CS201J-DATA STRUCTURES
UNIT-I
INTRODUCTION TO DATA STRUCTURES
DEPARTMENT OF SOFTWARE ENGINEERING

Assessment Method:
Cycle Test I - 10%
Cycle Test II - 15%
Cycle Test III - 15%
Surprise Test - 5%
Assignment - 5%
---------------------------------
Total - 50%

Text Books
1. Seymour Lipschutz, “Data Structures with C”, McGraw Hill Education, Special Indian Edition, 2014.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C”, 2nd Edition, Pearson Education, 2011.

Outline
• Introduction
• Basic terminology
• Data structures: operations
• ADT, Algorithms
• Complexity, Time-Space trade-off
• Mathematical notations and functions
• Asymptotic notations
• Linear and Binary search
• Bubble sort, Insertion sort
Elementary Data Organization
• Data are simply values or sets of values.

• Collections of data are frequently organized into a hierarchy of fields, records and files.
• This organization of data may not be complex enough to maintain and efficiently process certain collections of data.
• For this reason, data are organized into more complex types of structures called Data Structures.
DATA vs INFORMATION
DATA | INFORMATION
Raw fact about anything | Processed data
Computers need data | Humans need information
Data doesn’t depend on information | Information depends on data
Input to any system may be treated as data | Output after processing the data given to the system is information
Data is the raw material | Information is the product
Data may not be in order | Information should be in order

Information
• Data + Meaning
• Processed data (Making data meaningful and
useful)
• Subset of Data
• Manipulated raw data

Example
Data | Information
Ingredients | Recipe
01/01, or 01012015, or 20150101 (a date as data) | New Year’s Day, first day in a year
Each student's test score is one piece of data | Performance of a student, average score of a class, performance of a school (information derived from the given data)

Processing of data
• Collection of Data
• Organizing in a structure
• Manipulation of data for need

Data Structure
• Organizing data in computer memory so that it can be used efficiently (algorithm efficiency)
• A way of collecting and organising data so that we can perform operations on the data effectively
• Anything that can store data
• Rendering data elements in terms of some relationship, for better organization and storage
• Storing ordered data
Data Structure
    Primitive Data Structure
        Integer, Float, Character, Boolean
    Non-Primitive/User-Defined Data Structure
        Linear Data Structure: Arrays, Stack, Queues, Linked List
        Non-Linear Data Structure: Trees, Graphs

Linear Data Structure
• The elements organized in the data structure form a sequence
• Data elements are related/arranged in a linear fashion
• Representation of linear data structures in memory (two ways):
1. Elements in consecutive/sequential memory locations: Arrays
2. Elements with a link to another element: Linked List
• Other linear data structures: Stacks, Queues
Non-Linear Data Structure
• Data elements are not arranged in sequence
• Example: Trees and Graphs

Static Data Structure
• Memory size is fixed
• Size of the data structure is predefined
• Memory is allocated at compile time
Advantage:
No need to check/keep track of the size of the data structure
Disadvantage:
Memory can’t be utilized efficiently, as it stays assigned even after values are deleted

Dynamic Data Structure
• Memory is allocated to the data structure as the program executes, i.e. dynamically (at run time)
• Used when the size of the data structure is not known in advance
• Memory size is not fixed; memory is used as needed
Advantage:
Memory is used efficiently, assigned only as much as is needed.
Disadvantage:
The program must keep track of the memory utilized
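The contrast between static and dynamic structures can be sketched in C as below. This is only an illustration under stated assumptions: the names STATIC_CAPACITY, static_marks and make_dynamic_array are hypothetical and do not come from the slides.

```c
#include <stdlib.h>

#define STATIC_CAPACITY 100          /* hypothetical size, fixed at compile time */

/* Static structure: room for all 100 ints is reserved up front,
   whether or not the program ever uses it. */
int static_marks[STATIC_CAPACITY];

/* Dynamic structure: room for exactly n ints is requested at run time.
   Returns NULL if n is not positive or the allocation fails.
   The caller must remember n and release the block with free(). */
int *make_dynamic_array(int n)
{
    if (n <= 0)
        return NULL;
    int *a = malloc((size_t)n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        a[i] = 0;                    /* start from a known state */
    return a;
}
```

The need to remember n and to call free() is the "keep track of memory" cost listed as the disadvantage of dynamic structures.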
Data Structure Operations
1. Traversing: Accessing each record exactly once so that certain items in the
record may be processed.
2. Searching: Finding the location of the record with a given key value.
3. Inserting: Adding a new record to the structure.
4. Deleting: Removing a record from the structure.
5. Sorting: Arranging the records in some logical order.
6. Merging: Combining the records in two different sorted files into a single sorted file.

Algorithm
[Figure: a problem is solved by an algorithm; the algorithm runs on a “computer”, which turns input into output]

Algorithm
An Algorithm is a sequence of unambiguous
instructions for solving a problem,
i.e., for obtaining a required output for any
legitimate input in a finite amount of time.

Properties of Algorithm
1. Finiteness: terminates after a finite number of steps
2. Definiteness: each step is clearly, rigorously (exactly and accurately) and unambiguously specified
3. Input: valid inputs are clearly specified
4. Output: the expected output is clearly specified, and the algorithm can be proved to produce the correct output given a valid input
5. Effectiveness: steps are sufficiently simple and basic

Types of Algorithm
1) Iterative
A()
{
    for i=1 to n
        max(a,b)
}
2) Recursive
A(n)
{
    if( )
        A(n/2)
}
Algorithmic Efficiency Function

Basic Efficiency Classes
Class | Name | Comments
1 | constant | May be in best cases
lg n | logarithmic | Halving problem size at each iteration
n | linear | Scan a list of size n
n × lg n | linearithmic | Divide and conquer algorithms, e.g., mergesort
n^2 | quadratic | Two embedded loops, e.g., selection sort
n^3 | cubic | Three embedded loops, e.g., matrix multiplication
2^n | exponential | All subsets of an n-element set
n! | factorial | All permutations of an n-element set
Measure running time in terms of # of basic operations
• Basic operation: the operation that contributes the most to the total running time of an algorithm
• Usually the most time consuming operation in the algorithm’s innermost loop
Input size and basic operation examples
Problem | Measure of input size | Basic operation
Search for a key in a list of n items | # of items in the list | Key comparison
Add two n×n matrices | Dimensions of the matrices, n | Addition
Polynomial evaluation | Order of the polynomial | Multiplication
Asymptotic Notations
• 3 notations used to compare orders of growth of an algorithm’s basic operation count
– O(g(n)): set of functions that grow no faster than g(n); an upper bound, commonly used to state worst-case running time
– Ω(g(n)): set of functions that grow at least as fast as g(n); a lower bound, commonly used to state best-case running time
– Θ(g(n)): set of functions that grow at the same rate as g(n); a tight bound (not the same thing as the average case)

O-Notation (contd.)
• Definition: A function t(n) is said to be in O(g(n)), denoted t(n) ∈ O(g(n)), if t(n) is bounded above by some positive constant multiple of g(n) for all sufficiently large n.
• That is, if we can find positive constants c and n0 such that:
t(n) ≤ c × g(n) for all n ≥ n0
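As a small worked instance of the definition (the function t(n) = 2n + 2 and the constants are chosen here for illustration; other choices of c and n0 work equally well):

```latex
t(n) = 2n + 2, \qquad g(n) = n \\
2n + 2 \;\le\; 2n + n \;=\; 3n \quad \text{for all } n \ge 2 \\
\Rightarrow\; t(n) \le c \cdot g(n) \ \text{with } c = 3,\; n_0 = 2,
\ \text{so } t(n) \in O(n)
```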

O(big oh)-Notation
[Figure: t(n) stays below c × g(n) for all n ≥ n0; the behaviour for n < n0 doesn’t matter]
t(n) ∈ O(g(n))
Ω-Notation (contd.)
• Definition: A function t(n) is said to be in Ω(g(n)), denoted t(n) ∈ Ω(g(n)), if t(n) is bounded below by some positive constant multiple of g(n) for all sufficiently large n.
• That is, if we can find positive constants c and n0 such that
t(n) ≥ c × g(n) for all n ≥ n0

Ω(big omega)-Notation
[Figure: t(n) stays above c × g(n) for all n ≥ n0; the behaviour for n < n0 doesn’t matter]
t(n) ∈ Ω(g(n))

Θ-Notation (contd.)
• Definition: A function t(n) is said to be in Θ(g(n)), denoted t(n) ∈ Θ(g(n)), if t(n) is bounded both above and below by some positive constant multiples of g(n) for all sufficiently large n.
• That is, if we can find positive constants c1, c2, and n0 such that
c2 × g(n) ≤ t(n) ≤ c1 × g(n) for all n ≥ n0

Θ(big theta)-Notation
[Figure: t(n) stays between c2 × g(n) and c1 × g(n) for all n ≥ n0; the behaviour for n < n0 doesn’t matter]
t(n) ∈ Θ(g(n))

Complexity of Algorithms
• The complexity of an algorithm M is the function f(n) which gives the running time and/or storage space requirement of the algorithm in terms of the size n of the input data.
• Two types of complexity
1. Time Complexity
2. Space Complexity

Performance Analysis of Algorithm
An investigation of an algorithm’s efficiency with respect to two resources:
• Space Complexity: amount of memory an algorithm needs to perform the computation
• Time Complexity: amount of CPU time an algorithm needs to run to completion

Space Complexity
S(P) = c + Sp(instance)
The memory space S(P) needed by a program P consists of two components:
– A fixed part c: needed for instruction space (space for the code), simple variable space, constants space, etc. (Ignore c during calculation.)
– A variable part Sp(instance): dependent on a particular instance of input and output data.
• For a constant number of variables (constant space complexity): O(1)
• For an array of n elements (linear space complexity): O(n)

Example-1
Algorithm abc (a, b, c)
{
    return a+b+b*c+(a+b-c)/(a+b)+4.0;
}
Space required to store variables: a, b, and c.
Sp() = 3. S(P) = 3.

Example 2
Algorithm Sum(a[], n)
{
    s := 0.0;
    for i = 1 to n do
        s := s + a[i];
    return s;
}
Space required: n = 1, a[ ] = n (for n elements), i = 1, s = 1
Sp(n) = (n + 3).
Hence S(P) = (n + 3).

Time Complexity
• T(P) = c + tp(instance)
• The time T(P) required to run a program P also consists of two components:
– A fixed part c: compile time, which is independent of the problem instance. (Ignore c during calculation.)
– A variable part tp(instance): run time, which depends on the problem instance.

Theoretical Analysis of Time Efficiency
• Count the number of times the algorithm’s basic operation is executed on inputs of size n: C(n)
T(n) ≈ cop × C(n)
where n is the input size, T(n) is the running time, cop is the execution time for the basic operation, and C(n) is the number of times the basic operation is executed.
• Ignore cop; focus on orders of growth.

Example-1
Algorithm Sum(a[], n)
{
    s = 0.0;                 1
    for i = 1 to n do        n
        s = s + a[i];        n
    return s;                1
}
Total: 2n + 2 = O(n)
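The step count for Sum can be checked with a small instrumented C version. This is a sketch only: the function name sum_with_steps and the steps out-parameter are hypothetical, and the counting convention (one step per loop iteration for the loop header) simply mirrors the slide.

```c
#include <stddef.h>

/* Sums a[0..n-1] and reports, through *steps, the step count under the
   slide's convention:
   1 (initialisation) + n (loop steps) + n (additions) + 1 (return). */
double sum_with_steps(const double a[], size_t n, unsigned long *steps)
{
    unsigned long count = 0;
    double s = 0.0;
    count++;                        /* s = 0.0 */
    for (size_t i = 0; i < n; i++) {
        count++;                    /* one loop step per element */
        s += a[i];
        count++;                    /* s = s + a[i] */
    }
    count++;                        /* return s */
    *steps = count;
    return s;
}
```

For an array of 4 elements this reports 2 × 4 + 2 = 10 steps, which grows linearly with n.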
Example-2
Algorithm Sum(a[], n, m)
{
    for i = 1 to n do            n
        for j = 1 to m do        nm
            s = s + a[i][j];     nm
    return s;                    1
}
Total: 2nm + n + 1 = O(nm)
Iterative - Example
1) A()
{
    int i;
    for (i = 1; i <= n; i++)
        printf("abc");
}
Soln: O(n)

Iterative - Example
2) A()
{
    int i, j;
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            printf("abc");
}
Soln:
n × n = n^2, so O(n^2)
Iterative - Example
3) A()
{
    for (i = 1; i*i <= n; i++)
        printf(" ");
}
Soln:
The loop stops once i^2 exceeds n, i.e. at about
i = √n, so O(√n)

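Iterative - Example
4) A()
{
    int i, j, k, n;
    for (i = 1; i <= n; i++)
    {
        for (j = 1; j <= i*i; j++)
        {
            for (k = 1; k <= n/2; k++)
            {
                printf(" ");
            }
        }
    }
}
Soln:
Outer loop: n iterations
Middle loop: up to n^2 iterations per outer iteration, so n × n^2 = n^3
Inner loop: n/2 iterations each time, so n/2 × n^3 = O(n^4)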
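The O(n^4) estimate for Example 4 can be checked by counting. The sketch below assumes the loop bounds shown in the example; the function names count_example4 and closed_form_example4 are hypothetical. The exact count is (n/2) × (1^2 + 2^2 + … + n^2), using integer division for n/2, which grows like n^4/6.

```c
/* Counts how many times the innermost statement of Example 4 executes. */
unsigned long long count_example4(unsigned int n)
{
    unsigned long long count = 0;
    for (unsigned long long i = 1; i <= n; i++)
        for (unsigned long long j = 1; j <= i * i; j++)
            for (unsigned long long k = 1; k <= n / 2; k++)
                count++;
    return count;
}

/* Closed form: (n/2) * (1^2 + 2^2 + ... + n^2), with integer division n/2
   and the standard identity 1^2 + ... + n^2 = n(n+1)(2n+1)/6. */
unsigned long long closed_form_example4(unsigned int n)
{
    unsigned long long m = n;
    return (m / 2) * (m * (m + 1) * (2 * m + 1) / 6);
}
```

The two functions agree for every n, which confirms the count without running the printf calls.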

Example
ALGORITHM MaxElement(A[0..n-1])
//Determines largest element
maxval <- A[0]
for i <- 1 to n-1 do
    if A[i] > maxval
        maxval <- A[i]
return maxval
Input size: n
Basic operation: > or <-
C(n) = n - 1 ∈ Θ(n) (one comparison for each i from 1 to n-1)
Example
ALGORITHM UniqueElements(A[0..n-1])
//Determines whether all elements are distinct
for i <- 0 to n-2 do
    for j <- i+1 to n-1 do
        if A[i] = A[j]
            return false
return true
Input size: n
Basic operation: A[i] = A[j]
Does C(n) depend on type of input?
UniqueElements (contd.)
for i <- 0 to n-2 do
    for j <- i+1 to n-1 do
        if A[i] = A[j]
            return false
return true
Cworst(n) = (n-1) + (n-2) + … + 1 = n(n-1)/2 ∈ Θ(n^2)
Why is Cworst(n) ∈ Θ(n^2) better than saying Cworst(n) ∈ O(n^2)?
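A C sketch of UniqueElements that also counts the basic operation makes the input dependence visible. The function name unique_elements and the comparisons out-parameter are hypothetical additions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if all elements of a[0..n-1] are distinct.
   *comparisons receives the number of a[i] == a[j] tests performed. */
bool unique_elements(const int a[], size_t n, unsigned long *comparisons)
{
    unsigned long count = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            count++;                 /* basic operation: a[i] == a[j] */
            if (a[i] == a[j]) {
                *comparisons = count;
                return false;        /* early exit on the first duplicate */
            }
        }
    }
    *comparisons = count;
    return true;
}
```

With five distinct elements the function makes 5 × 4 / 2 = 10 comparisons (the worst case), while an array whose first two elements are equal needs only one.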
General Plan for Recursive Algorithms
• Decide on input size parameter
• Identify the basic operation
• Does C(n) also depend on input type?
• Set up a recurrence relation
• Solve the recurrence or, at least establish
the order of growth of its solution
Solving Recurrence Relations
• To solve a recurrence relation T(n) we need to derive a form of T(n) that is not a recurrence relation. Such a form is called a closed form of the recurrence relation.
• There are five methods to solve recurrence relations that represent the running time of recursive methods:
– Iteration method (unrolling and summing)
– Substitution method (guess the solution and verify by induction)
– Recursion tree method
– Master theorem (Master method)
– Using generating functions or characteristic equations
• In this course, we will use the Iteration method
Iteration method
• Steps:
– Expand the recurrence
– Express the expansion as a summation by plugging the recurrence back into itself until you see a pattern.
– Evaluate the summation
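• In evaluating the summation one or more of the following summation formulae may be used:
– Arithmetic series: 1 + 2 + … + n = n(n+1)/2
– Geometric series: 1 + x + x^2 + … + x^n = (x^(n+1) - 1)/(x - 1), for x ≠ 1
– Special case of the geometric series: 1 + 2 + 2^2 + … + 2^n = 2^(n+1) - 1
– Harmonic series: 1 + 1/2 + 1/3 + … + 1/n ≈ ln n
– Others: 1^2 + 2^2 + … + n^2 = n(n+1)(2n+1)/6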
Example-1
Factorial
ALGORITHM F(n)
// Output: n!
if n = 0
    return 1
else
    return F(n-1) × n
Input size: n
Basic operation: ×
M(n) = M(n-1) + 1 for n > 0
(M(n-1) multiplications to compute F(n-1), plus 1 to multiply n and F(n-1))
Analysis Of Recursive Factorial method
• Example 1: Form and solve the recurrence relation for the running time of the factorial method and hence determine its big-O complexity:
long factorial (int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial (n - 1);
}
T(0) = c (1)
T(n) = b + T(n - 1) (2)
= b + b + T(n - 2) by substituting T(n – 1) in (2)
= b + b + b + T(n - 3) by substituting T(n – 2) in (2)
…
= kb + T(n - k)
The base case is reached when n – k = 0 ⇒ k = n, we then have:
T(n) = nb + T(n - n)
= bn + T(0)
= bn + c
Therefore the method factorial is O(n)
Analysis Of Recursive Binary Search
public int binarySearch (int target, int[] array, int low, int high) {
    if (low > high)
        return -1;
    else {
        int middle = (low + high)/2;
        if (array[middle] == target)
            return middle;
        else if (array[middle] < target)
            return binarySearch(target, array, middle + 1, high);
        else
            return binarySearch(target, array, low, middle - 1);
    }
}
• The recurrence relation for the running time of the method is:
T(1) = a if n = 1 (one element array)
T(n) = T(n / 2) + b if n > 1
Analysis Of Recursive Binary Search (Cont’d)
Without loss of generality, assume n, the problem size, is a power of 2, i.e., n = 2^k
Expanding:
T(1) = a (1)
T(n) = T(n / 2) + b (2)
= [T(n / 2^2) + b] + b = T(n / 2^2) + 2b by substituting T(n/2) in (2)
= [T(n / 2^3) + b] + 2b = T(n / 2^3) + 3b by substituting T(n/2^2) in (2)
= ……..
= T(n / 2^k) + kb
The base case is reached when n / 2^k = 1 ⇒ n = 2^k ⇒ k = log2 n, we then have:
T(n) = T(1) + b log2 n
= a + b log2 n
Therefore, Recursive Binary Search is O(log n)
Analysis Of Recursive Fibonacci
long fibonacci (int n) { // Recursively calculates Fibonacci number
    if (n == 1 || n == 2)
        return 1;
    else
        return fibonacci(n - 1) + fibonacci(n - 2);
}
T(n) = c if n = 1 or n = 2 (1)
T(n) = T(n – 1) + T(n – 2) + b if n > 2 (2)
We determine a lower bound on T(n), using T(n - 1) ≥ T(n - 2):
Expanding: T(n) = T(n - 1) + T(n - 2) + b
≥ T(n - 2) + T(n - 2) + b
= 2T(n - 2) + b
= 2[T(n - 3) + T(n - 4) + b] + b by substituting T(n - 2) in (2)
≥ 2[T(n - 4) + T(n - 4) + b] + b
= 2^2 T(n - 4) + 2b + b
= 2^2 [T(n - 5) + T(n - 6) + b] + 2b + b by substituting T(n - 4) in (2)
≥ 2^3 T(n – 6) + (2^2 + 2^1 + 2^0)b
...
≥ 2^k T(n – 2k) + (2^(k-1) + 2^(k-2) + . . . + 2^1 + 2^0)b
= 2^k T(n – 2k) + (2^k – 1)b
The base case is reached when n – 2k = 2 ⇒ k = (n - 2) / 2
Hence T(n) ≥ 2^((n – 2) / 2) T(2) + [2^((n - 2) / 2) – 1]b
= (b + c) 2^((n – 2) / 2) – b
= [(b + c) / 2] × 2^(n/2) – b ⇒ Recursive Fibonacci is exponential
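The exponential growth can also be observed directly by counting calls. The sketch below instruments the recursive method in C; the name fibonacci_counted and the calls counter are hypothetical additions, and n ≥ 1 is assumed as in the method above.

```c
/* Recursive Fibonacci, instrumented to count how many calls are made.
   Assumes n >= 1. The caller sets *calls to 0 before the first call. */
long fibonacci_counted(int n, unsigned long *calls)
{
    (*calls)++;                      /* one more invocation */
    if (n == 1 || n == 2)
        return 1;
    return fibonacci_counted(n - 1, calls) + fibonacci_counted(n - 2, calls);
}
```

For n = 10 the result is 55 and the function is invoked 109 times; for n = 20 the result is 6765 with 13529 invocations, so the work grows far faster than n.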
Algorithmic Notation
Example:
Write an algorithm for finding the location of the largest element of an array Data.
Largest-Item (Data, N, Loc)

1. [Initialize] Set k:=1, Loc:=1 and Max:=Data[1]
2. [Increment Counter] Set k:=k+1
3. [Test Counter] If k > N , then :
Write: Loc, Max and Exit
4. [Compare and Update] If Max < Data[k] ,then :
Set Loc:=k and Max:=Data[k]
5. [Repeat loop] Go to Step 2.
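A C rendering of Largest-Item might look as follows. It is a sketch, not the textbook's code: C arrays are 0-based, so the returned location is a 0-based index rather than the 1-based Loc above, and the function name largest_item is hypothetical.

```c
#include <stddef.h>

/* Returns the 0-based index of the first largest element of data[0..n-1].
   Assumes n >= 1. */
size_t largest_item(const int data[], size_t n)
{
    size_t loc = 0;                  /* [Initialize] */
    int max = data[0];
    for (size_t k = 1; k < n; k++) { /* [Increment and test counter] */
        if (max < data[k]) {         /* [Compare and update] */
            loc = k;
            max = data[k];
        }
    }
    return loc;
}
```

Because the comparison is a strict <, ties keep the earliest position, exactly as in Step 4 of the algorithm.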

Analysis of Algorithm Properties
• Measuring an input’s size
• Measuring Running Time
• Orders of Growth (of algorithm’s efficiency
function)-based on the variation of inputs
– Best-case – first found
– Average-case – mid find
– Worst-case – last found or not found

Worst, Best, Average Cases
• Efficiency depends on input size n
• For some algorithms, efficiency depends on
the type of input
• Example: Linear Search
– Given a list of n elements and a search key k,
find if k is in the list
– Scan list, compare elements with k until either
found a match (success), or list is exhausted
(failure)

Linear Search Algorithm
ALGORITHM LinearSearch(A[0..n-1], k)
//Input: A[0..n-1] and k
//Output: Index of first match or -1 if no match is found
i <- 0
while i < n and A[i] ≠ k do
    i <- i+1
if i < n
    return i //A[i] = k
else
    return -1
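The pseudocode maps almost line for line onto C. The sketch below assumes an int array; the function name linear_search is illustrative.

```c
/* Returns the index of the first element of a[0..n-1] equal to k,
   or -1 if k does not occur. */
int linear_search(const int a[], int n, int k)
{
    int i = 0;
    while (i < n && a[i] != k)       /* stop at a match or at the end */
        i++;
    return (i < n) ? i : -1;
}
```

The test i < n comes first in the loop condition so that a[i] is never read past the end of the array.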

Worst Case Time for Linear Search
• For an array of n elements, the worst case time for serial search requires n array accesses: O(n).
• Consider cases where we must loop over all n
records:
– desired record appears in the last position of
the array
– desired record does not appear in the array at
all

Average Case for Linear Search
Assumptions:
1. All keys are equally likely in a search
2. We always search for a key that is in the array
Example:
• We have an array of 10 records.
• If we search for the first record, then it requires 1 array access; if the second, then 2 array accesses, etc.
The average of all these searches is:
(1+2+3+4+5+6+7+8+9+10)/10 = 5.5

Average Case Time for Linear Search
Generalize for array size n.
Expression for average-case running time:
= (1+2+…+n)/n
= n(n+1)/(2n)
= (n+1)/2
Therefore, average case time complexity for serial search is O(n).

Binary Search
• Requires a sorted array or a binary search tree.
• Cuts the “search space” in half each time.
• Keeps cutting the search space in half until the target is found or all possible locations have been exhausted.

Binary Search Algorithm
look at “middle” element
if no match then
    look left (if the target is smaller)
    or right (if greater)

The Binary Search Algorithm
• Return found or not found (true or false), so it should be a function.
• When we move left or right, change the array boundaries
– We’ll need a first and last

The Binary Search Algorithm
calculate middle position
if (first and last have “crossed”) then
    “Item not found”
elseif (element at middle = to_find) then
    “Item Found”
elseif to_find < element at middle then
    Look to the left
else
    Look to the right
Looking Left
• Use indices “first” and “last” to keep track of
where we are looking
• Move left by setting last = middle – 1
7 12 42 59 71 86 104 212
F at 7, M at 59, old L at 212; new L at 42

Looking Right
• Use indices “first” and “last” to keep track of
where we are looking
• Move right by setting first = middle + 1
7 12 42 59 71 86 104 212
Old F at 7, M at 59, L at 212; new F at 71

Binary Search Example – Found
Looking for 42
7 12 42 59 71 86 104 212
Step 1: F at 7, M at 59, L at 212 (42 < 59, look left)
Step 2: F at 7, M at 12, L at 42 (42 > 12, look right)
Step 3: F, M and L all at 42
42 found – in 3 comparisons

Binary Search Example – Not Found
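Looking for 89
7 12 42 59 71 86 104 212
Step 1: F at 7, M at 59, L at 212 (89 > 59, look right)
Step 2: F at 71, M at 86, L at 212 (89 > 86, look right)
Step 3: F at 104, M at 104, L at 212 (89 < 104, look left)
Step 4: L at 86, F at 104 (first and last have crossed)
89 not found – 3 comparisons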

Binary Search Function
Function Find return boolean (A Array, first, last, to_find)
    middle <- (first + last) div 2
    if (first > last) then
        return false
    elseif (A[middle] = to_find) then
        return true
    elseif (to_find < A[middle]) then
        return Find(A, first, middle-1, to_find)
    else
        return Find(A, middle+1, last, to_find)
endfunction
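A C version of Find could be sketched as below, assuming a sorted int array and 0-based indices. Two details differ from the pseudocode and are choices made here, not part of the slides: the bounds check comes before middle is used, and middle is computed as first + (last - first) / 2 to avoid overflow of first + last on very large arrays.

```c
#include <stdbool.h>

/* Recursive binary search on the sorted range a[first..last].
   Returns true if to_find is present, false otherwise. */
bool find(const int a[], int first, int last, int to_find)
{
    if (first > last)                /* first and last have crossed */
        return false;
    int middle = first + (last - first) / 2;
    if (a[middle] == to_find)
        return true;
    if (to_find < a[middle])
        return find(a, first, middle - 1, to_find);   /* look left */
    return find(a, middle + 1, last, to_find);        /* look right */
}
```

For an array of n elements the initial call is find(a, 0, n - 1, key).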

Binary Search Analysis: Best Case
Best Case: 1 comparison (match from the first comparison)
1 7 9 12 33 42 59 76 81 84 91 92 93 99
Target: 59
Binary Search Analysis: Worst Case
How many comparisons?
Worst Case: divide until we reach one item, or no match.
• With each comparison we throw away ½ of the list
N ………… 1 comparison
N/2 ………… 1 comparison
N/4 ………… 1 comparison
N/8 ………… 1 comparison
. . .
1 ………… 1 comparison
Number of steps is about log2 N (at most ⌊log2 N⌋ + 1 comparisons)
Analysis of Linear and Binary Search
• Binary search reduces the work by half at each comparison
• If array is not sorted: Linear Search
– Best Case O(1)
– Worst Case O(N)
• If array is sorted: Binary search
– Best Case O(1)
– Worst Case O(log2 N)

Insertion Sort

Definition
• Insertion sort is a simple sorting algorithm
that builds the final sorted array (or list) one
item at a time. It is much less efficient on large
lists than more advanced algorithms such as
quick sort, heap sort, or merge sort.
• More efficient than selection sort and bubble
sort

Two Steps
• Selection
• Compare – Shift - Insert

Example of insertion sort
8 2 4 9 3 6
2 8 4 9 3 6
2 4 8 9 3 6
2 4 8 9 3 6
2 3 4 8 9 6
2 3 4 6 8 9 done

An array of elements, initially unordered
• Left – Sorted
• Right – Unsorted
Steps
1. Key selection: the first key is always the second element in the array
2. Compare the key with the previous element
3. If the key is smaller than the previous element, swap it with (shift it past) the previous elements until the key is inserted in its correct position (while loop)
4. Otherwise, the element after the previous key becomes the new key
5. Repeat steps 1 to 4 until the last element (index n-1) has been the key and is inserted in its correct position (for loop)

Algorithm
InsertionSort(A[0…n-1])
//Sorts a given array by Insertion Sort
//Input: An array A[0…n-1] of n elements
//Output: Array A[0…n-1] sorted in nondecreasing order
for j ← 1 to n-1
    do key ← A[j]                      //Selection
       i ← j-1
       while i ≥ 0 and A[i] > key      //Comparison
           do A[i + 1] ← A[i]          //Shift
              i ← i-1
       A[i + 1] ← key                  //Insert
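The pseudocode translates directly into C. The sketch below assumes an int array; the function name insertion_sort is illustrative.

```c
/* Sorts a[0..n-1] in nondecreasing order by insertion sort. */
void insertion_sort(int a[], int n)
{
    for (int j = 1; j < n; j++) {
        int key = a[j];              /* Selection */
        int i = j - 1;
        while (i >= 0 && a[i] > key) {   /* Comparison */
            a[i + 1] = a[i];         /* Shift the larger element right */
            i--;
        }
        a[i + 1] = key;              /* Insert */
    }
}
```

Applied to 8 2 4 9 3 6, the array becomes 2 3 4 6 8 9, matching the worked example.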

Complexity
• Space/Memory
• Time
– Count a particular operation
– Count number of steps
– Asymptotic complexity

Best Case Analysis
Elements: 1 2 3 4 5 6
j=1, key=a[1]=2
i=0, while 0>=0 and a[0]>key
1>2 is false, hence go back to the for loop
j=2, key=a[2]=3
i=1, while 1>=0 and a[1]>key
2>3 is false, hence go back to the for loop
Hence the comparisons are
j=1, i=0: 1>2
j=2, i=1: 2>3
j=3, i=2: 3>4
j=4, i=3: 4>5
j=5, i=4: 5>6
The while-loop condition is tested (n-1) times and is false each time.
Hence only (n-1) comparisons are made.
So the order is O(n-1) = O(n)
Worst-Case Analysis
Elements: 6 5 4 3 2 1
In 1st pass: 1 comparison, 1 swap
In 2nd pass: 2 comparisons, 2 swaps
In 3rd pass: 3 comparisons, 3 swaps
……
We have (n-1) passes = 5 passes, i.e. 1 + 2 + 3 + 4 + 5
= (n-5)+(n-4)+(n-3)+(n-2)+(n-1) for n = 6
total compares = 1 + 2 + 3 + … + (n-1)
= (n-1)n/2
= (n^2 - n)/2
= O(n^2)
Advantages and disadvantages of Insertion sort
Advantages:
• It is efficient for small sets of data
• It is simple to implement
• Insertion sort passes through the array only once
• It is adaptive: efficient for data sets that are already sorted.
Disadvantages:
• It is less efficient on larger lists and arrays.
The Bubble Sort
• Bubble sort, sometimes referred to as
sinking sort, is a simple sorting algorithm that
repeatedly steps through the list to be sorted,
compares each pair of adjacent items and
swaps them if they are in the wrong order.

The Bubble Sort
• Each scan will push the maximum element to the top
• 3 5 6 2 8 9 1
• 3 5 6 2 8 9 1
• 3 5 2 6 8 9 1
• 3 5 2 6 8 9 1
• 3 5 2 6 8 9 1
• 3 5 2 6 8 1 9

The Bubble Sort
• Starts at one end of the array and makes repeated scans through the list, comparing successive pairs of elements
• 5 3 6 2 8 9 1
• If the first element of a pair is larger than the second (called an inversion), then the values are swapped
• 3 5 6 2 8 9 1

The Bubble Sort
• This is the “bubbling” effect that gives
the bubble sort its name
• This process is continued until the list is
sorted
• The more inversions in the list, the longer
it takes to sort

"Bubbling Up" the Largest Element
• Traverse a collection of elements
– Move from the front to the end
– “Bubble” the largest value to the end using
pair-wise comparisons and swapping
77 42 35 12 101 5    initial list
42 77 35 12 101 5    77 > 42: swap
42 35 77 12 101 5    77 > 35: swap
42 35 12 77 101 5    77 > 12: swap
42 35 12 77 101 5    77 < 101: no need to swap
42 35 12 77 5 101    101 > 5: swap
Largest value correctly placed

The “Bubble Up” Algorithm
index <- 1
last_compare_at <- n - 1
loop
    exitif(index > last_compare_at)
    if(A[index] > A[index + 1]) then
        Swap(A[index], A[index + 1])
    endif
    index <- index + 1
endloop
Items of Interest
• Notice that only the largest value is correctly
placed
• All other values are still out of order
• So we need to repeat this process
1 2 3 4 5 6
42 35 12 77 5 101
Largest value correctly placed

Repeat “Bubble Up” How Many Times?
• If we have N elements…
• And if each time we bubble an element, we place it in its correct location…
• Then we repeat the “bubble up” process N – 1 times.
• This guarantees we’ll correctly place all N elements.

“Bubbling” All the Elements
N-1 passes:
42 35 12 77 5 101    after pass 1
35 12 42 5 77 101    after pass 2
12 35 5 42 77 101    after pass 3
12 5 35 42 77 101    after pass 4
5 12 35 42 77 101    after pass 5

Reducing the Number of Comparisons
77 42 35 12 101 5    initial list
42 35 12 77 5 101    after pass 1
35 12 42 5 77 101    after pass 2
12 35 5 42 77 101    after pass 3
12 5 35 42 77 101    after pass 4

Reducing the Number of Comparisons
• On the Nth “bubble up”, we only need to
do MAX-N comparisons.
• For example:
– This is the 4th “bubble up”
– MAX is 6
– Thus we have 2 comparisons to do
1 2 3 4 5 6
12 35 5 42 77 101

procedure Bubblesort(A isoftype in/out Arr_Type)
    to_do, index isoftype Num
    to_do <- N - 1
    loop                                    // Outer loop
        exitif(to_do = 0)
        index <- 1
        loop                                // Inner loop
            exitif(index > to_do)
            if(A[index] > A[index + 1]) then
                Swap(A[index], A[index + 1])
            endif
            index <- index + 1
        endloop
        to_do <- to_do - 1
    endloop
endprocedure // Bubblesort
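A C sketch of the procedure follows, using 0-based indices. It adds one thing that the procedure above does not have: a swapped flag that stops the sort as soon as a pass makes no swaps. That flag is an assumption added here, and it is what lets an already sorted list finish after a single pass. The function name bubble_sort and the returned comparison count are also illustrative.

```c
#include <stdbool.h>

/* Sorts a[0..n-1] in nondecreasing order by bubble sort and returns the
   number of element comparisons made. Stops early when a full pass
   performs no swap (an addition to the procedure shown above). */
unsigned long bubble_sort(int a[], int n)
{
    unsigned long comparisons = 0;
    for (int to_do = n - 1; to_do > 0; to_do--) {       /* outer loop */
        bool swapped = false;
        for (int index = 0; index < to_do; index++) {   /* inner loop */
            comparisons++;
            if (a[index] > a[index + 1]) {
                int tmp = a[index];
                a[index] = a[index + 1];
                a[index + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped)
            break;                   /* list is already in order */
    }
    return comparisons;
}
```

On an already sorted list of 5 elements this makes 4 comparisons; on 5 elements in descending order it makes 4 + 3 + 2 + 1 = 10.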

Best Case of the Bubble Sort
• Let X: number of interchanges (discrete).
• Consider a list of n elements already sorted
x1 < x2 < x3 < … < xn

Best Case (cont.)
• If the sort stops as soon as a pass makes no swaps, we only have to run through the list once since it is already in order, giving n-1 comparisons and X = 0.
• Therefore, the time complexity of the best case scenario is O(n).

Worst Case of the Bubble Sort
• Let X: number of interchanges (discrete).
• Consider a list of n elements in descending order
x1 > x2 > x3 > … > xn.

Worst Case (cont.)
• # of comparisons = # of interchanges for each pass.
• X = (n-1) + (n-2) + (n-3) + … + 2 + 1
• This is an arithmetic series.

Worst Case (cont.)
• (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2
• So X = n(n-1)/2.
• Therefore, the time complexity of the worst case scenario is O(n^2).

Advantages and disadvantages of Bubble sort
Advantages:
• Popular and easy to implement
• Elements are swapped in place without using additional temporary storage, so the space requirement is minimal
Disadvantages:
• Does not deal well with a list containing a huge number of items
Mock Questions
• Compute the analysis of the algorithm to find the
largest element in an array.
• Compute the analysis of the algorithm to find the
unique element in an array.
• Compute the analysis of factorial of n nos.
• Compute the analysis of selection sort.
• Compute the analysis of merge sort.
• Compute the analysis of quick sort.
• Compute the analysis of shell sort
