week7-3-sort
Learning Objectives
• Recognize how different sorting algorithms solve the same problem in different ways
• Recognize the general algorithm and trace code for three algorithms:
selection sort, insertion sort, and merge sort
2
Search Algorithms Benefit from Sorting
We use search algorithms a lot in computer science. Just think of how
many times a day you use Google, or search for a file on your computer.
We've determined that search algorithms work better when the items
they search over are sorted. Can we write an algorithm to sort items
efficiently?
3
Many Ways of Sorting
There are a ton of algorithms that we can use to sort a list.
4
Selection Sort
5
Selection Sort Sorts From Smallest to Largest
The core idea of selection sort is to repeatedly select the smallest remaining
element and move it to the end of the sorted part, building the result from
smallest to largest.
Note: for selection sort, swapping the element currently in the front position
with the smallest element is faster than sliding all of the numbers down in
the list.
6
Sidebar: Swapping Elements in a List
We'll often need to swap elements in lists as we sort them. Let's implement
swapping first.
To swap two elements, you need to create a temporary variable to hold one
of them. This keeps the first element from getting overwritten.
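A minimal sketch of such a swap helper (the course's version may be written slightly differently; swap(lst, i, j) is the signature the later code assumes):

def swap(lst, i, j):
    # hold one element in a temporary variable so it isn't overwritten
    temp = lst[i]
    lst[i] = lst[j]
    lst[j] = temp

lst = [5, 2, 8]
swap(lst, 0, 2)
print(lst)   # [8, 2, 5]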
7
Selection Sort: Repeatedly select the next smallest element and add it to the sorted part
[Diagram: the list starts fully UNSORTED with i = 0. During each pass, everything before index i is SORTED; index j scans the UNSORTED part to find the smallest remaining element (min), which is then swapped into position i. At the start of the next pass, the SORTED part has grown by one element.]
8
Selection Sort Code
def selectionSort(lst):
    # i is the index of the first unsorted element;
    # everything before it is sorted
    for i in range(len(lst) - 1):
        # find the smallest element in the unsorted part
        smallestIndex = i
        for j in range(smallestIndex + 1, len(lst)):
            if lst[j] < lst[smallestIndex]:
                smallestIndex = j
        swap(lst, i, smallestIndex)
    return lst
We'll also talk about individual passes of the sorting algorithms. A pass
is a single iteration of the outer loop (or putting a single element into
its sorted location).
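To make the idea of a pass concrete, here is a hedged sketch (selectionSortVerbose is a made-up name and the example values are arbitrary) that prints the list after each pass, assuming the swap helper from the sidebar:

def selectionSortVerbose(lst):
    # same algorithm as selectionSort, with a print after each pass
    for i in range(len(lst) - 1):
        smallestIndex = i
        for j in range(smallestIndex + 1, len(lst)):
            if lst[j] < lst[smallestIndex]:
                smallestIndex = j
        swap(lst, i, smallestIndex)
        print("after pass", i + 1, ":", lst)
    return lst

selectionSortVerbose([29, 10, 14, 37, 13])
# after pass 1 : [10, 29, 14, 37, 13]
# after pass 2 : [10, 13, 14, 37, 29]
# after pass 3 : [10, 13, 14, 37, 29]
# after pass 4 : [10, 13, 14, 29, 37]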
10
Selection Sort Code – Comparisons and Swaps
def selectionSort(lst):
    # i is the index of the first unsorted element;
    # everything before it is sorted
    for i in range(len(lst) - 1):              # a single iteration of this loop is a pass
        # find the smallest element in the unsorted part
        smallestIndex = i
        for j in range(smallestIndex + 1, len(lst)):
            if lst[j] < lst[smallestIndex]:    # comparison
                smallestIndex = j
        swap(lst, i, smallestIndex)            # swap
    return lst
Total comparisons:
(n-1) + (n-2) + ... + 2 + 1 = n * (n-1) / 2 = n²/2 - n/2
(For example, with n = 5: 4 + 3 + 2 + 1 = 10 = 5 * 4 / 2.)
12
Selection Sort – Swaps
What about swaps?
The algorithm does a single swap at the end of each pass, and there are
n-1 passes, so there are only n-1 swaps - that part is just O(n).
The comparisons dominate, so selection sort is O(n²) overall.
13
Insertion Sort
14
Insertion Sort Builds From the Front
The core idea of insertion sort is to keep a sorted part at the front of the
list and insert each new item into its correct position within that sorted part.
15
Insertion Sort: Repeatedly insert the next element into the sorted part
[Diagram: i starts at 1; everything before index i is SORTED and everything from i onward is UNSORTED. During each pass, the element at index i is inserted backwards into the SORTED part until it reaches its correct position. At the start of the next pass, the SORTED part has grown by one element.]
16
Insertion Sort Code
def insertionSort(lst):
    # i is the index of the first unsorted element;
    # everything before it is sorted
    for i in range(1, len(lst)):
        unsorted = i
        # compare and swap until the element is in the correct place
        while unsorted > 0 and lst[unsorted] < lst[unsorted-1]:
            swap(lst, unsorted, unsorted-1)
            unsorted = unsorted - 1
    return lst
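A quick usage sketch (the example values are arbitrary):

print(insertionSort([29, 10, 14, 37, 13]))   # [10, 13, 14, 29, 37]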
19
Insertion Sort – Comparisons and Swaps
In the worst case: For every comparison, we will also make a swap.
Insert 2nd element: 1 comparison & swap
Insert 3rd element: 2 comparisons & swaps
...
Insert last element: n-1 comparisons & swaps
Total actions:
2 * (1 + 2 + ... + (n-1)) = 2 * (n * (n-1) / 2) = n² - n
= O(n²)
20
Sidebar: Insertion Sort Best Case
Why do we care about insertion sort? While its worst case is just as bad
as selection sort's, its best case is much better!
The best case for insertion sort is an already-sorted list. On this input,
the algorithm does 1 comparison and no swaps on each pass - only n-1
comparisons in total, which is O(n).
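To check this claim, here is a hedged sketch of an instrumented insertion sort (countInsertionWork is a made-up helper, not from the lecture) that counts comparisons and swaps, assuming the swap helper from earlier:

def countInsertionWork(lst):
    comparisons = 0
    swaps = 0
    for i in range(1, len(lst)):
        unsorted = i
        while unsorted > 0:
            comparisons = comparisons + 1
            if lst[unsorted] < lst[unsorted - 1]:
                swap(lst, unsorted, unsorted - 1)
                swaps = swaps + 1
                unsorted = unsorted - 1
            else:
                break
    return comparisons, swaps

print(countInsertionWork([1, 2, 3, 4, 5]))   # (4, 0): best case, already sorted
print(countInsertionWork([5, 4, 3, 2, 1]))   # (10, 10): worst case, reverse sorted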
21
Merge Sort
22
Improve Efficiency with a Drastic Change
If we want to do better than O(n²), we need to make a drastic change in our
algorithms.
23
Merge Sort Delegates, Then Merges
The core idea of merge sort is to split the list in half, recursively sort
each half, and then merge the two sorted halves back together.
24
Merge Sort Process
84 27 49 91 32 53 63 17
Divide:
84 27 49 91 | 32 53 63 17
Conquer: (sort)
27 49 84 91 | 17 32 53 63
Combine: (merge)
17 27 32 49 53 63 84 91
25
Merge Sort Code
def mergeSort(lst):
    # base case: lists of 0 or 1 elements are already sorted
    if len(lst) < 2:
        return lst
    # divide
    mid = len(lst) // 2
    front = lst[:mid]
    back = lst[mid:]
    # conquer by sorting each half
    front = mergeSort(front)
    back = mergeSort(back)
    # combine the sorted halves
    return merge(front, back)
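For example, running it on the list from the earlier diagram:

print(mergeSort([84, 27, 49, 91, 32, 53, 63, 17]))
# [17, 27, 32, 49, 53, 63, 84, 91]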
26
Merge By Checking the Front of the Lists
How do we merge two sorted lists? Repeatedly compare the front (first
unmerged) element of each list and copy over the smaller one; because both
lists are sorted, that element is guaranteed to be the smallest remaining.
27
Merge Code
def merge(front, back):
    result = [ ]
    i = 0
    j = 0
    while i < len(front) and j < len(back):
        # only compare the front elements - each is guaranteed to be
        # the smallest remaining in its list because the lists are sorted
        if front[i] < back[j]:
            result.append(front[i])
            i = i + 1
        else:
            result.append(back[j])
            j = j + 1
    # add the remaining elements (only one list still has values)
    result = result + front[i:] + back[j:]
    return result
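For example, merging the two sorted halves from the earlier diagram:

print(merge([27, 49, 84, 91], [17, 32, 53, 63]))
# [17, 27, 32, 49, 53, 63, 84, 91]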
28
Merge Sort – Efficiency Analysis
Merge Sort doesn't have swaps. Instead, we'll consider the number of
comparisons and copies that are performed.
What's the worst case input? Any list, really - merge sort always splits and
merges the same way, so it does roughly the same amount of work on every input.
29
Merge Sort Code
def mergeSort(lst):
    if len(lst) < 2:
        return lst
    mid = len(lst) // 2
    front = lst[:mid]                        # copy
    back = lst[mid:]                         # copy
    front = mergeSort(front)
    back = mergeSort(back)
    return merge(front, back)

def merge(front, back):
    result = [ ]
    i = 0
    j = 0
    while i < len(front) and j < len(back):
        if front[i] < back[j]:               # comparison
            result.append(front[i])          # copy
            i = i + 1
        else:
            result.append(back[j])           # copy
            j = j + 1
    result = result + front[i:] + back[j:]   # copy
    return result

lst = [2, 4, 1, 5, 10, 8, 3, 6, 7, 9]
lst = mergeSort(lst)
print(lst)
30
Merge Sort Call Breakdown
2 4 1 5 8 3 6 7
2 4 1 5 | 8 3 6 7
2 4 | 1 5 | 8 3 | 6 7
2 | 4 | 1 | 5 | 8 | 3 | 6 | 7
2 4 | 1 5 | 3 8 | 6 7
1 2 4 5 | 3 6 7 8
1 2 3 4 5 6 7 8
31
Merge Sort Call Breakdown
(n copies in each split-pass; n copies + n comparisons in each merge-pass)
               2 4 1 5 8 3 6 7
Split Pass 1:  2 4 1 5 | 8 3 6 7
Split Pass 2:  2 4 | 1 5 | 8 3 | 6 7
Split Pass 3:  2 | 4 | 1 | 5 | 8 | 3 | 6 | 7
Merge Pass 1:  2 4 | 1 5 | 3 8 | 6 7
Merge Pass 2:  1 2 4 5 | 3 6 7 8
Merge Pass 3:  1 2 3 4 5 6 7 8
32
Merge Sort Efficiency
How many split-passes and merge-passes occur?
Every time a pass occurs, we cut the number of elements being sorted in half,
so the number of passes is the number of times we can divide the list in half:
log₂(n) split-passes and log₂(n) merge-passes. Each pass does about n copies
(and each merge-pass also does up to n comparisons), so merge sort does roughly
n * log₂(n) work in total, which is O(n log n). Compare that to the roughly
n²/2 comparisons made by selection sort:
n      n * (n-1) / 2      n * log₂(n)    ratio
8      28                 24             0.85
16     120                64             0.53
32     496                160            0.3
2¹⁰    523,776            10,240         0.02
2²⁰    549,755,289,600    20,971,520     0.00004
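As a hedged sketch, the table's values can be reproduced directly (the printed ratios carry slightly more precision than the rounded numbers above):

import math

# compare n*(n-1)/2 (selection/insertion sort comparisons)
# against n*log2(n) (merge sort work)
for n in [8, 16, 32, 2**10, 2**20]:
    quadratic = n * (n - 1) // 2
    linearithmic = n * int(math.log2(n))
    print(n, quadratic, linearithmic, round(linearithmic / quadratic, 5))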
34
Comparing Big-O Functions
[Graph: number of operations vs. n (amount of data) for O(1), O(log n), O(n), O(n log n), O(n²), and O(2ⁿ); each successive function grows faster as n increases.]
35
Sidebar: General Sorting Efficiency
In general, the best we can do for sorting efficiency is O(n log n). This is
actually the efficiency of the built-in Python sort!
You can't reduce the time to O(n) unless you put certain restrictions on
the values being sorted.
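As a quick illustration, using the values from the merge sort example:

lst = [84, 27, 49, 91, 32, 53, 63, 17]
print(sorted(lst))   # sorted() returns a new sorted list
lst.sort()           # list.sort() sorts the list in place
print(lst)           # [17, 27, 32, 49, 53, 63, 84, 91]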
36
Learning Objectives
• Recognize how different sorting algorithms solve the same problem in
different ways
• Recognize the general algorithm and trace code for three algorithms:
selection sort, insertion sort, and merge sort
37