Sorting and Algorithm Analysis
• Ground rules:
• sort the values in increasing order
• sort “in place,” using only a small amount of additional storage
• Terminology:
• position: one of the memory locations in the array
• element: one of the data items stored in the array
• element i: the element at position i
[diagram: the variable arr on the stack holds a reference to the array object on the heap; an unsorted array is shown before sorting and as 2 4 6 12 15 after sorting]
Selecting an Element
• When we consider position i, the elements in positions
0 through i – 1 are already in their final positions.
0 1 2 3 4 5 6
example for i = 3: 2 4 7 21 25 10 17
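To make the selection strategy concrete, here is a minimal Java sketch; the names SelectionSortDemo, selectionSort, and indexSmallest are our own, not necessarily those used in the course's Sort class.

```java
public class SelectionSortDemo {
    // Return the index of the smallest element in arr[lower..arr.length-1].
    private static int indexSmallest(int[] arr, int lower) {
        int indexMin = lower;
        for (int i = lower + 1; i < arr.length; i++) {
            if (arr[i] < arr[indexMin]) {
                indexMin = i;
            }
        }
        return indexMin;
    }

    private static void swap(int[] arr, int a, int b) {
        int temp = arr[a];
        arr[a] = arr[b];
        arr[b] = temp;
    }

    public static void selectionSort(int[] arr) {
        // After pass i, positions 0 through i hold their final values.
        for (int i = 0; i < arr.length - 1; i++) {
            int j = indexSmallest(arr, i);
            swap(arr, i, j);
        }
    }

    public static void main(String[] args) {
        int[] arr = {15, 6, 2, 12, 4};
        selectionSort(arr);
        System.out.println(java.util.Arrays.toString(arr));  // [2, 4, 6, 12, 15]
    }
}
```

Note how the loop invariant matches the bullet above: when the outer loop considers position i, positions 0 through i - 1 already hold their final values.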
Time Analysis
• Some algorithms are much more efficient than others.
• comparisons performed by selection sort:
C(n) = (n - 1) + (n - 2) + … + 1
     = (n - 1)((n - 1) + 1)/2
     = (n - 1)n/2
     = n^2/2 - n/2
Focusing on the Largest Term
• When n is large, mathematical expressions of n are dominated
by their “largest” term — i.e., the term that grows fastest as a
function of n.
• example:
n        n^2/2         n/2      n^2/2 - n/2
10       50            5        45
100      5,000         50       4,950
10,000   50,000,000    5,000    49,995,000
Big-O Notation
• We specify the largest term using big-O notation.
• e.g., we say that C(n) = n^2/2 - n/2 is O(n^2)
[graph: growth of log n, n, n log n, and n^2 as n runs from 0 to 12]
• Moves: after each of the n-1 passes, the algorithm does one swap.
• n-1 swaps, 3 moves per swap
• M(n) = 3(n-1) = 3n-3
• selection sort performs O(n) moves.
• Big-O notation specifies an upper bound on a function f(n)
as n grows large.
[graph: f(n) lies below g(n) = n^2 for sufficiently large n]
Big-O Notation and Tight Bounds
• Strictly speaking, big-O notation provides an upper bound,
not a tight bound (upper and lower).
• Example:
• 3n – 3 is O(n^2) because 3n – 3 <= n^2 for all n >= 1
• 3n – 3 is also O(2^n) because 3n – 3 <= 2^n for all n >= 1
Insertion Sort
• Basic idea:
• going from left to right, “insert” each element into its proper
place with respect to the elements to its left
• “slide over” other elements to make room
• Example:
0 1 2 3 4
15 4 2 12 6
4 15 2 12 6
2 4 15 12 6
2 4 12 15 6
2 4 6 12 15
Comparing Selection and Insertion Strategies
• In selection sort, we start with the positions in the array and
select the correct elements to fill them.
• In insertion sort, we start with the elements and determine
where to insert them in the array.
• Here’s an example that illustrates the difference:
0 1 2 3 4 5 6
18 12 15 9 25 2 17
• Sorting by selection:
• consider position 0: find the element (2) that belongs there
• consider position 1: find the element (9) that belongs there
• …
• Sorting by insertion:
• consider the 12: determine where to insert it
• consider the 15: determine where to insert it
• …
Inserting an Element
• When we consider element i, elements 0 through i – 1
are already sorted with respect to each other.
0 1 2 3 4
example for i = 3: 6 14 19 9 …
• To insert element i:
• make a copy of element i, storing it in the variable toInsert:
0 1 2 3
toInsert 9 6 14 19 9
int j = i;
do {
    arr[j] = arr[j-1];   // slide element j-1 one position to the right
    j = j - 1;
} while (j > 0 && toInsert < arr[j-1]);
arr[j] = toInsert;
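Putting the pieces together, one plausible version of the full method looks like the sketch below; the guard test before the do-while (which skips elements already in place) is our assumption about how the fragment above is reached.

```java
public class InsertionSortDemo {
    public static void insertionSort(int[] arr) {
        for (int i = 1; i < arr.length; i++) {
            // Only shift if element i is out of order with its left neighbor;
            // this guard is assumed, so the do-while is entered safely.
            if (arr[i] < arr[i - 1]) {
                int toInsert = arr[i];    // make a copy of element i
                int j = i;
                do {
                    arr[j] = arr[j - 1];  // slide an element to the right
                    j = j - 1;
                } while (j > 0 && toInsert < arr[j - 1]);
                arr[j] = toInsert;
            }
        }
    }

    public static void main(String[] args) {
        int[] arr = {15, 4, 2, 12, 6};   // the example array from the slides
        insertionSort(arr);
        System.out.println(java.util.Arrays.toString(arr));  // [2, 4, 6, 12, 15]
    }
}
```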
Time Analysis of Insertion Sort
• The number of operations depends on the contents of the array.
• best case: the array is already sorted
• each element is only compared to the element to its left
• we never execute the do-while loop!
• C(n) = n - 1, M(n) = 0, running time = O(n)
• nearly the same holds if the array is almost sorted
• worst case: the array is in reverse order
• each element is compared to all of the elements to its left:
arr[1] is compared to 1 element (arr[0])
arr[2] is compared to 2 elements (arr[0] and arr[1])
…
arr[n-1] is compared to n-1 elements
• C(n) = 1 + 2 + … + (n - 1) = n(n - 1)/2 = O(n^2)
• similarly, M(n) = O(n^2), running time = O(n^2)
• average case: elements are randomly arranged
• on average, each element is compared to half
of the elements to its left
• still get C(n) = M(n) = O(n^2), running time = O(n^2)
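The best-case and worst-case comparison counts can be checked empirically. The instrumented method below is our own sketch: it counts the guard comparison plus each comparison made in the do-while condition.

```java
public class InsertionSortCounts {
    // Insertion sort instrumented to count element comparisons.
    public static long countComparisons(int[] arr) {
        long count = 0;
        for (int i = 1; i < arr.length; i++) {
            count++;  // the guard comparison: arr[i] < arr[i-1]
            if (arr[i] < arr[i - 1]) {
                int toInsert = arr[i];
                int j = i;
                do {
                    arr[j] = arr[j - 1];
                    j = j - 1;
                    if (j > 0) {
                        count++;  // the toInsert < arr[j-1] comparison
                    }
                } while (j > 0 && toInsert < arr[j - 1]);
                arr[j] = toInsert;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 10;
        int[] sorted = new int[n], reversed = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;          // best case: already sorted
            reversed[i] = n - i;    // worst case: reverse order
        }
        System.out.println(countComparisons(sorted));    // 9  = n - 1
        System.out.println(countComparisons(reversed));  // 45 = n(n-1)/2
    }
}
```

For n = 10 this prints 9 comparisons in the best case and 45 = 10·9/2 in the worst case, matching the formulas above.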
Shell Sort
• Developed by Donald Shell
• Shell sort uses larger moves that allow elements to quickly get
close to where they belong in the sorted array.
Sorting Subarrays
• Basic idea:
• use insertion sort on subarrays that contain elements
separated by some increment incr
• increments allow the data items to make larger “jumps”
• repeat using a decreasing sequence of increments
• Example for an initial increment of 3:
0 1 2 3 4 5 6 7
36 18 10 27 3 20 9 8
• three subarrays:
1) elements 0, 3, 6
2) elements 1, 4, 7
3) elements 2 and 5
27 3 10 36 18 20 9 8
27 3 10 36 18 20 9 8
9 3 10 27 18 20 36 8
9 3 10 27 8 20 36 18
Inserting an Element in a Subarray
• When we consider element i, the other elements in its subarray
are already sorted with respect to each other.
0 1 2 3 4 5 6 7
example for i = 6:
(incr = 3) 27 3 10 36 18 20 9 8
the other elements in 9's subarray (the 27 and 36)
are already sorted with respect to each other
• To insert element i:
• make a copy of element i, storing it in the variable toInsert:
0 1 2 3 4 5 6 7
toInsert 9 27 3 10 36 18 20 9 8
• Our version uses values that are one less than a power of two.
• 2^k – 1 for some k
• … 63, 31, 15, 7, 3, 1
• can get to the next lower increment using integer division:
incr = incr/2;
int j = i;
do {
    arr[j] = arr[j-incr];   // slide an element incr positions to the right
    j = j - incr;
} while (j > incr-1 && toInsert < arr[j-incr]);
arr[j] = toInsert;          // insert the saved value
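A full shellSort method assembled from these pieces might look like the following sketch. The way the initial increment is computed (the largest value of the form 2^k - 1 derived from the array length) is one common choice, and an assumption on our part.

```java
public class ShellSortDemo {
    public static void shellSort(int[] arr) {
        // Start with the largest increment of the form 2^k - 1
        // that is less than the array length.
        int incr = 1;
        while (2 * incr <= arr.length) {
            incr = 2 * incr;
        }
        incr = incr - 1;

        while (incr >= 1) {
            // Insertion sort on each subarray of elements incr apart.
            for (int i = incr; i < arr.length; i++) {
                if (arr[i] < arr[i - incr]) {
                    int toInsert = arr[i];
                    int j = i;
                    do {
                        arr[j] = arr[j - incr];
                        j = j - incr;
                    } while (j > incr - 1 && toInsert < arr[j - incr]);
                    arr[j] = toInsert;
                }
            }
            incr = incr / 2;   // ... 63, 31, 15, 7, 3, 1
        }
    }

    public static void main(String[] args) {
        int[] arr = {36, 18, 10, 27, 3, 20, 9, 8};
        shellSort(arr);
        System.out.println(java.util.Arrays.toString(arr));
        // [3, 8, 9, 10, 18, 20, 27, 36]
    }
}
```

Because integer division of 2^k - 1 by 2 yields 2^(k-1) - 1, the single statement incr = incr/2 walks down the whole increment sequence and ends at 1, where the final pass is an ordinary insertion sort.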
Bubble Sort
• Perform a sequence of passes from left to right
• each pass swaps adjacent elements if they are out of order
• larger elements “bubble up” to the end of the array
• Example: 0 1 2 3 4
28 24 37 15 5
after the first pass: 24 28 15 5 37
after the second: 24 15 5 28 37
after the third: 15 5 24 28 37
after the fourth: 5 15 24 28 37
Implementation of Bubble Sort
public class Sort {
...
public static void bubbleSort(int[] arr) {
for (int i = arr.length - 1; i > 0; i--) {
for (int j = 0; j < i; j++) {
if (arr[j] > arr[j+1]) {
swap(arr, j, j+1);
}
}
}
}
}
• Nested loops:
• the inner loop performs a single pass
• the outer loop governs:
• the number of passes (arr.length - 1)
• the ending point of each pass (the current value of i)
• C(n): the inner loop always runs to the end of the pass, so
C(n) = (n - 1) + (n - 2) + … + 1 = n(n - 1)/2
• M(n) = 3 moves per swap, so at most 3n(n - 1)/2
• in the best case (array already sorted): no swaps, so M(n) = 0
• Running time:
• C(n) is always O(n^2), M(n) is never worse than O(n^2)
• therefore, the largest term of C(n) + M(n) is O(n^2)
Quicksort
• Like bubble sort, quicksort uses an approach based on swapping
out-of-order elements, but it’s more efficient.
• Example of partitioning with a pivot of 9:
7 15 4 9 6 18 9 12
is rearranged to
7 9 4 6 | 9 18 15 12
all values <= 9 | all values >= 9
[partitioning traces omitted: several examples show the indices i and j sweeping inward from the two ends, swapping out-of-place pairs; when the indices meet or cross, partition returns j, and the two subarrays it delimits are then sorted recursively]
• Worst case: every partition splits off a single element, so the
subarrays processed at successive levels have lengths
n, n - 1, n - 2, …, 2 (paired with a length-1 subarray each time).
• the number of comparisons for a partition is proportional to the
length of the subarray being partitioned, so:
• C(n) = 2 + 3 + … + n = O(n^2). M(n) and the run time are also O(n^2).
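The partition code itself is not shown in this excerpt, so the following is a sketch of one standard variant: Hoare-style partitioning around the middle element's value, returning j once the indices meet or cross. On the example array above, the first call does produce the split {7, 9, 4, 6} / {9, 18, 15, 12}.

```java
public class QuickSortDemo {
    private static void swap(int[] arr, int a, int b) {
        int temp = arr[a];
        arr[a] = arr[b];
        arr[b] = temp;
    }

    // Rearrange arr[first..last] around the middle element's value.
    // Returns an index j such that arr[first..j] <= pivot <= arr[j+1..last].
    private static int partition(int[] arr, int first, int last) {
        int pivot = arr[(first + last) / 2];
        int i = first - 1;
        int j = last + 1;
        while (true) {
            do { i++; } while (arr[i] < pivot);  // find a value >= pivot
            do { j--; } while (arr[j] > pivot);  // find a value <= pivot
            if (i < j) {
                swap(arr, i, j);     // put the out-of-place pair in order
            } else {
                return j;            // the indices have met or crossed
            }
        }
    }

    private static void qSort(int[] arr, int first, int last) {
        if (first >= last) {
            return;                  // base case: 0 or 1 element
        }
        int split = partition(arr, first, last);
        qSort(arr, first, split);
        qSort(arr, split + 1, last);
    }

    public static void quickSort(int[] arr) {
        qSort(arr, 0, arr.length - 1);
    }

    public static void main(String[] args) {
        int[] arr = {7, 15, 4, 9, 6, 18, 9, 12};
        quickSort(arr);
        System.out.println(java.util.Arrays.toString(arr));
        // [4, 6, 7, 9, 9, 12, 15, 18]
    }
}
```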
Mergesort
• The algorithms we've seen so far have sorted the array in place.
• use only a small amount of additional memory
Merging Sorted Arrays
• To merge sorted arrays A and B into an array C, we maintain
three indices, which start out on the first elements of the arrays:
A: 2 8 14 24   (index i)
B: 5 7 9 11    (index j)
C: the merged result, filled in using index k
       12 8 14 4 6 33 2 27
split: 12 8 14 4 | 6 33 2 27
split: 12 8 | 14 4 | 6 33 | 2 27
split: 12 | 8 | 14 | 4 | 6 | 33 | 2 | 27
merge: 8 12 | 4 14 | 6 33 | 2 27
merge: 4 8 12 14 | 2 6 27 33
merge: 2 4 6 8 12 14 27 33
Tracing the Calls to Mergesort
the initial call is made on the entire array:
12 8 14 4 6 33 2 27
split into two 4-element subarrays, and make a recursive call to sort the left subarray:
12 8 14 4 6 33 2 27
12 8 14 4
split into two 2-element subarrays, and make a recursive call to sort the left subarray:
12 8 14 4 6 33 2 27
12 8 14 4
12 8
Tracing the Calls to Mergesort
split into two 1-element subarrays, and make a recursive call to sort the left subarray:
12 8 14 4 6 33 2 27
12 8 14 4
12 8
12
base case, so return to the call for the subarray {12, 8}:
12 8 14 4 6 33 2 27
12 8 14 4
12 8
make a recursive call to sort its right subarray {8}; this is another
base case, so return again to the call for {12, 8}:
12 8 14 4 6 33 2 27
12 8 14 4
12 8
Tracing the Calls to Mergesort
merge the sorted halves of {12, 8}:
12 8 14 4 6 33 2 27
12 8 14 4
12 8 → 8 12
end of the method, so return to the call for the 4-element subarray, which now has
a sorted left subarray:
12 8 14 4 6 33 2 27
8 12 14 4
then make a recursive call to sort its right subarray {14, 4}:
12 8 14 4 6 33 2 27
8 12 14 4
14 4
split it into two 1-element subarrays, and make a recursive call to sort the left subarray:
12 8 14 4 6 33 2 27
8 12 14 4
14 4
14 base case…
Tracing the Calls to Mergesort
return to the call for the subarray {14, 4}:
12 8 14 4 6 33 2 27
8 12 14 4
14 4
make a recursive call to sort its right subarray {4}:
12 8 14 4 6 33 2 27
8 12 14 4
14 4
4 base case…
merge the sorted halves of {14, 4}:
12 8 14 4 6 33 2 27
8 12 14 4
14 4 → 4 14
Tracing the Calls to Mergesort
end of the method, so return to the call for the 4-element subarray, which now has
two sorted 2-element subarrays:
12 8 14 4 6 33 2 27
8 12 | 4 14
merge the sorted halves:
12 8 14 4 6 33 2 27
8 12 | 4 14 → 4 8 12 14
end of the method, so return to the original call, which now has a sorted
left subarray:
4 8 12 14 6 33 2 27
perform a similar set of recursive calls to sort the right subarray. here's the result:
4 8 12 14 2 6 27 33
finally, merge the sorted 4-element subarrays to get a fully sorted 8-element array:
2 4 6 8 12 14 27 33
Implementing Mergesort
• In theory, we could create new arrays for each new pair of
subarrays, and merge them back into the array that was split.
• Instead, we'll create a temporary array of the same size as the original.
• pass it to each call of the recursive mergesort method
• use it when merging subarrays of the original array:
arr:  8 12 4 14 6 33 2 27
temp: 4 8 12 14
• after each merge, copy the result back into the original array:
arr:  4 8 12 14 6 33 2 27
temp: 4 8 12 14
Methods for Mergesort
• Here's the key recursive method:
private static void mSort(int[] arr, int[] temp, int start, int end){
if (start >= end) { // base case: subarray of length 0 or 1
return;
} else {
int middle = (start + end)/2;
mSort(arr, temp, start, middle);
mSort(arr, temp, middle + 1, end);
merge(arr, temp, start, middle, middle + 1, end);
}
}
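The merge helper called by mSort is not shown in this excerpt; a sketch consistent with that call's signature (left subarray bounds followed by right subarray bounds) might look like this:

```java
import java.util.Arrays;

public class MergeSortDemo {
    // Merge the sorted subarrays arr[leftStart..leftEnd] and
    // arr[rightStart..rightEnd] into temp, then copy the result back.
    private static void merge(int[] arr, int[] temp,
            int leftStart, int leftEnd, int rightStart, int rightEnd) {
        int i = leftStart;   // index into the left subarray
        int j = rightStart;  // index into the right subarray
        int k = leftStart;   // index into temp
        while (i <= leftEnd && j <= rightEnd) {
            if (arr[i] < arr[j]) {
                temp[k++] = arr[i++];
            } else {
                temp[k++] = arr[j++];
            }
        }
        while (i <= leftEnd)  { temp[k++] = arr[i++]; }  // left leftovers
        while (j <= rightEnd) { temp[k++] = arr[j++]; }  // right leftovers
        for (k = leftStart; k <= rightEnd; k++) {
            arr[k] = temp[k];   // copy the merged result back
        }
    }

    private static void mSort(int[] arr, int[] temp, int start, int end) {
        if (start >= end) {
            return;   // base case: subarray of length 0 or 1
        }
        int middle = (start + end) / 2;
        mSort(arr, temp, start, middle);
        mSort(arr, temp, middle + 1, end);
        merge(arr, temp, start, middle, middle + 1, end);
    }

    public static void mergeSort(int[] arr) {
        int[] temp = new int[arr.length];   // one shared temporary array
        mSort(arr, temp, 0, arr.length - 1);
    }

    public static void main(String[] args) {
        int[] arr = {12, 8, 14, 4, 6, 33, 2, 27};
        mergeSort(arr);
        System.out.println(Arrays.toString(arr));
        // [2, 4, 6, 8, 12, 14, 27, 33]
    }
}
```

Allocating temp once in the wrapper and passing it down, rather than allocating inside every recursive call, keeps the extra memory at O(n).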
• at all but the last level of the call tree, there are 2n moves
(n to copy elements into temp, and n to copy the merged results back)
• how many levels are there? repeatedly halving n gives roughly log2(n) + 1 levels
• M(n) ≈ 2n log2(n) = O(n log n)
• C(n) is also O(n log n), so mergesort's running time is O(n log n)
Summary: Sorting Algorithms
algorithm        best case    avg case     worst case   extra memory
selection sort   O(n^2)       O(n^2)       O(n^2)       O(1)
insertion sort   O(n)         O(n^2)       O(n^2)       O(1)
Shell sort       O(n log n)   O(n^1.5)     O(n^1.5)     O(1)
bubble sort      O(n^2)       O(n^2)       O(n^2)       O(1)
quicksort        O(n log n)   O(n log n)   O(n^2)       O(log n) best/avg, O(n) worst
mergesort        O(n log n)   O(n log n)   O(n log n)   O(n)
• O(n^2)-time
• O(n^3)-time
• O(log2 n)-time
• O(2^n)-time
How Does the Actual Running Time Scale?
• How much time is required to solve a problem of size n?
• assume that each operation requires 1 microsec (1 x 10^-6 sec)
time                       problem size (n)
function   10         20        30        40         50        60
n          .00001 s   .00002 s  .00003 s  .00004 s   .00005 s  .00006 s
n^2        .0001 s    .0004 s   .0009 s   .0016 s    .0025 s   .0036 s
n^5        .1 s       3.2 s     24.3 s    1.7 min    5.2 min   13.0 min
2^n        .001 s     1.0 s     17.9 min  12.7 days  35.7 yrs  36,600 yrs
• sample computations:
• when n = 10, an n^2 algorithm performs 10^2 operations.
10^2 * (1 x 10^-6 sec) = .0001 sec
• when n = 30, a 2^n algorithm performs 2^30 operations.
2^30 * (1 x 10^-6 sec) = 1073 sec = 17.9 min
• sample computations: how large a problem can we solve in one hour?
• 1 hour = 3600 sec
that's enough time for 3600/(1 x 10^-6) = 3.6 x 10^9 operations
• n^2 algorithm:
n^2 = 3.6 x 10^9, so n = (3.6 x 10^9)^(1/2) = 60,000
• 2^n algorithm:
2^n = 3.6 x 10^9, so n = log2(3.6 x 10^9) ≈ 31
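These one-hour figures can be checked mechanically. The sketch below reproduces the n^2 figure from the text; the 2^n figure is our own estimate (the largest n with 2^n at most 3.6 x 10^9), since the excerpt cuts off at that computation.

```java
public class HourBudget {
    // Largest n with n^2 <= ops.
    public static long maxNForNSquared(double ops) {
        return (long) Math.sqrt(ops);
    }

    // Largest n with 2^n <= ops.
    public static long maxNForTwoToTheN(double ops) {
        return (long) (Math.log(ops) / Math.log(2));
    }

    public static void main(String[] args) {
        // operations possible in one hour at 1 microsec per operation:
        // 3600 sec / (1 x 10^-6 sec) = 3.6 x 10^9
        double ops = 3.6e9;
        System.out.println(maxNForNSquared(ops));    // 60000
        System.out.println(maxNForTwoToTheN(ops));   // 31
    }
}
```

The contrast is the whole point of the table: in the same hour, a quadratic algorithm handles inputs of size 60,000, while an exponential one handles inputs of size about 31.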