Chapter 05: Sorting and Searching Algorithms

The document discusses sorting and searching algorithms, focusing on their analysis, running times, and comparisons. It covers various algorithms such as insertion sort, mergesort, and binary search, detailing their performance in best, worst, and average cases. Additionally, it highlights the importance of sorting for efficient searching and provides examples of algorithm implementations.


Center of Biomedical Engineering

Data Structures and Algorithms


(BMED – 3112)

Sorting and Searching Algorithms

October 2024
Contents

• Algorithm analysis
• Running time
• Comparisons in insertion sort and merging
• Mergesort and its comparisons
• Sorting
• Searching: sequential search and binary search
Algorithm analysis

• Determine the amount of resources an algorithm requires to run


– computation time, space in memory
• Running time of an algorithm is the number of basic operations performed
– Additions, multiplications, comparisons
– Usually grows with the size of the input
– Faster to add 2 numbers than to add 2,000,000!
Running time

• Worst-case running time


– upper bound on the running time
– guarantee the algorithm will never take longer to run
• Average-case running time
– time it takes the algorithm to run on average (expected value)
• Best-case running time
– lower bound on the running time
– guarantee the algorithm will not run faster
Sorting and Searching
• Fundamental problems in computer science and programming.
• Sorting is done to make searching easier.
• There are multiple different algorithms to solve the same problem.
– How do we know which algorithm is "better"?
• Look at searching first.
• Examples will use arrays of ints to illustrate the algorithms.

Sorting and Searching 5


Comparisons in insertion sort

• Worst case
– element k requires (k-1) comparisons
– total number of comparisons:
0 + 1 + 2 + … + (n-1) = ½ n(n-1) = ½ (n² - n)
• Best case
– elements 2 through n each require one comparison
– total number of comparisons:
1+1+1+ … + 1 = n-1

(n-1) times
Running time of insertion sort

• Best case running time is linear


• Worst case running time is quadratic
• Average case running time is also quadratic
– on average element k requires (k-1)/2 comparisons
– total number of comparisons:
½ (0 + 1 + 2 + … + (n-1)) = ¼ n(n-1) = ¼ (n² - n)
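These counts can be checked empirically. Below is a minimal counting sketch (the class and method names are illustrative, not from the slides) that instruments the insertion sort described in these notes:

```java
// Count comparisons made by insertion sort on a given array.
// Worst case (reverse-sorted input) should give n(n-1)/2;
// best case (already sorted) should give n-1.
public class InsertionCount {
    static long sortAndCount(int[] list) {
        long comparisons = 0;
        for (int i = 1; i < list.length; i++) {
            int temp = list[i];
            int j = i;
            while (j > 0) {
                comparisons++;               // one comparison against list[j-1]
                if (temp < list[j - 1]) {
                    list[j] = list[j - 1];   // shift the larger element right
                    j--;
                } else {
                    break;
                }
            }
            list[j] = temp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 10;
        int[] sorted = new int[n], reversed = new int[n];
        for (int i = 0; i < n; i++) { sorted[i] = i; reversed[i] = n - i; }
        System.out.println(sortAndCount(sorted));    // best case: n-1 = 9
        System.out.println(sortAndCount(reversed));  // worst case: n(n-1)/2 = 45
    }
}
```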
Mergesort

27 10 12 20
divide
27 10 12 20
divide divide
27 10 12 20
merge merge

10 27 12 20
merge
10 12 20 27
Merging two sorted lists

first list second list result of merge

10 27 12 20 10

10 27 12 20 10 12

10 27 12 20 10 12 20

10 27 12 20 10 12 20 27
Comparisons in merging

• Merging two sorted lists of size m requires at least m and at most 2m-1
comparisons
– m comparisons if all elements in one list are smaller than all elements in
the second list
– 2m-1 comparisons if the smallest element alternates between lists
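Both extremes can be observed with a small counting sketch (names are illustrative, not from the slides):

```java
public class MergeCount {
    // Merge two sorted arrays, returning the number of element comparisons.
    static int mergeCount(int[] a, int[] b) {
        int i = 0, j = 0, k = 0, comparisons = 0;
        int[] result = new int[a.length + b.length];
        while (i < a.length && j < b.length) {
            comparisons++;                      // one comparison per loop pass
            result[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) result[k++] = a[i++];  // copy leftovers: no comparisons
        while (j < b.length) result[k++] = b[j++];
        return comparisons;
    }

    public static void main(String[] args) {
        // all of a smaller than all of b: m comparisons
        System.out.println(mergeCount(new int[]{1, 2, 3}, new int[]{4, 5, 6})); // 3
        // smallest remaining element alternates between lists: 2m-1 comparisons
        System.out.println(mergeCount(new int[]{1, 3, 5}, new int[]{2, 4, 6})); // 5
    }
}
```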
Logarithm

• The power to which a base a must be raised to produce n
– a is called the base of the logarithm
• Frequently used logarithms have special symbols
– lg n = log2 n logarithm base 2
– ln n = loge n natural logarithm (base e)
– log n = log10 n common logarithm (base 10)
• If we assume n is a power of 2, then the number of times we can recursively
divide n numbers in half is lg n
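The halving claim can be checked directly with a few lines (a minimal sketch; the names are ours):

```java
public class HalvingCount {
    // Count how many times n can be halved before reaching 1.
    // For n a power of 2 this is exactly lg n.
    static int timesHalved(int n) {
        int count = 0;
        while (n > 1) {
            n /= 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(timesHalved(8));     // lg 8    = 3
        System.out.println(timesHalved(1024));  // lg 1024 = 10
    }
}
```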
Comparisons at each merge

#lists   #elements      #merges   #comparisons   #comparisons
         in each list             per merge      total
n        1              n/2       1              n/2
n/2      2              n/4       3              3n/4
n/4      4              n/8       7              7n/8
…        …              …         …              …
2        n/2            1         n-1            n-1


Comparisons in mergesort

• Total number of comparisons is the sum of the number of comparisons made
at each merge
– at most n comparisons at each merge
– the number of times we can recursively divide n numbers in half is lg n, so
there are lg n merges
– there are at most n lg n comparisons total
Comparison of sorting algorithms

• Best, worst and average-case running time of mergesort is O(n log n).
– This consistency is due to the algorithm always dividing the array into two
halves and merging them, regardless of the initial order of elements.
• Compare to average case behavior of insertion sort:

n Insertion sort Mergesort


10 25 33
100 2500 664
1000 250000 9965
10000 25000000 132877
100000 2500000000 1660960
Quicksort

• Most commonly used sorting algorithm


• One of the fastest sorts in practice
• Best and average-case running time is O(n lg n)
• Worst-case running time is quadratic
• Runs very fast on most computers when implemented correctly
Searching
Searching finds a particular element in a given data structure.

• Determine the location or existence of an element in a collection of elements
of the same type
• Easier to search large collections when the elements are already sorted
– finding a phone number in the phone book
– looking up a word in the dictionary
• What if the elements are not sorted?
Sequential search

• Given a collection of n unsorted elements, compare each element in
sequence
• Worst-case: Unsuccessful search
– search element is not in input
– make n comparisons
– search time is linear
• Average-case:
– expect to search ½ the elements
– make n/2 comparisons
– search time is linear
Searching sorted input

• If the input is already sorted, we can search more efficiently than linear time
• Example: “Higher-Lower”
– think of a number between 1 and 1000
– have someone try to guess the number
– if they are wrong, you tell them if the number is higher than their guess or
lower
• Strategy?
• How many guesses should we expect to make?
Best Strategy

• Always pick the number in the middle of the range


• Why?
– you eliminate half of the possibilities with each guess

• We should expect to make at most lg 1000 ≈ 10 guesses

• Binary search
– search n sorted inputs in logarithmic time
Binary search

• Search for 9 in a list of 16 elements

1 3 4 5 5 7 9 10 11 13 14 18 20 22 23 30

1 3 4 5 5 7 9 10

5 7 9 10

9 10

9
Sequential vs. binary search

• Average-case running time of sequential search is linear
• Average-case running time of binary search is logarithmic
• Number of comparisons:

n sequential binary
search search
2 1 1
16 8 4
256 128 8
4096 2048 12
65536 32768 16
Searching

• Given a list of data, find the location of a particular value or
report that the value is not present
• Linear search
– intuitive approach
– start at first item
– is it the one I am looking for?
– if not, go to next item
– repeat until found or all items checked
• If items are not sorted or unsortable, this approach is necessary


Linear Search
/* pre: list != null
post: return the index of the first occurrence
of target in list or -1 if target not present in
list
*/
public int linearSearch(int[] list, int target) {
for(int i = 0; i < list.length; i++)
if( list[i] == target )
return i;
return -1;
}



Attendance Question 1
• What is the average case Big O of linear search in an
array with N items, if an item is present?
A. O(N)
B. O(N²)
C. O(1)
D. O(log N)
E. O(N log N)

On average the algorithm has a Big O runtime of O(N), even though the
average number of comparisons for a search that runs only halfway
through the list is N/2.


Searching in a Sorted List
• If items are sorted then we can divide and conquer
• Dividing your work in half with each step
– is generally a good thing
• The binary search on a list in ascending order:
– start at middle of list
– is that the item?
– if not, is it less than or greater than the item?
– less than: move to second half of list
– greater than: move to first half of list
– repeat until found or sub list size = 0
Binary Search
[figure: array "list" with low, middle, and high index markers]

Is the middle item what we are looking for? If not, is it
more or less than the target item? (Assume lower.)

[figure: low, middle, and high markers now span only the lower half of the list]
and so forth…
Binary Search in Action
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53
public static int bsearch(int[] list, int target)
{ int result = -1;
int low = 0;
int high = list.length - 1;
int mid;
while( result == -1 && low <= high )
{ mid = low + ((high - low) / 2);
if( list[mid] == target )
result = mid;
else if( list[mid] < target)
low = mid + 1;
else
high = mid - 1;
}
return result;
}
// mid = ( low + high ) / 2; // may overflow!!!
// or mid = (low + high) >>> 1; using bitwise op
Attendance Question 2

What is the worst case Big O of binary search in an array
with N items, if an item is present?
A. O(N)
B. O(N²)
C. O(1)
D. O(log N)
E. O(N log N)

The worst-case time complexity is O(log N). This means that as the
number of values in a dataset increases, the number of comparisons
the algorithm makes grows as the base-2 logarithm of the number of
values.


Generic Binary Search
public static int bsearch(Comparable[] list, Comparable target)
{ int result = -1;
int low = 0;
int high = list.length - 1;
int mid;
while( result == -1 && low <= high )
{ mid = low + ((high - low) / 2);
if( target.equals(list[mid]) )
result = mid;
else if(target.compareTo(list[mid]) > 0)
low = mid + 1;
else
high = mid - 1;
}
return result;
}



Recursive Binary Search
public static int bsearch(int[] list, int target){
return bsearch(list, target, 0, list.length - 1);
}

public static int bsearch(int[] list, int target,
int first, int last){
if( first <= last ){
int mid = first + ((last - first) / 2);
if( list[mid] == target )
return mid;
else if( list[mid] > target )
return bsearch(list, target, first, mid - 1);
else
return bsearch(list, target, mid + 1, last);
}
return -1;
}
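A self-contained, runnable check of the recursive binary search, with `mid` computed from the recursion's own `first`/`last` bounds (the class name and example data are illustrative):

```java
public class RecursiveBSearch {
    public static int bsearch(int[] list, int target) {
        return bsearch(list, target, 0, list.length - 1);
    }

    private static int bsearch(int[] list, int target, int first, int last) {
        if (first > last)
            return -1;                           // empty range: not found
        int mid = first + ((last - first) / 2);  // overflow-safe midpoint
        if (list[mid] == target)
            return mid;
        else if (list[mid] > target)
            return bsearch(list, target, first, mid - 1);
        else
            return bsearch(list, target, mid + 1, last);
    }

    public static void main(String[] args) {
        int[] primes = {2, 3, 5, 7, 11, 13, 17, 19};
        System.out.println(bsearch(primes, 11));  // prints 4
        System.out.println(bsearch(primes, 4));   // prints -1
    }
}
```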



Other Searching Algorithms (reading assignment)
• Interpolation search
– more like what people really do
• Indexed searching
• Binary search trees
• Hash table searching
• Grover's algorithm (waiting for quantum computers to be built)
• Best-first search
• A*



Sorting



Sorting Fun
Why Not Bubble Sort?



Sorting
• A fundamental application for computers
• Done to make finding data (searching) faster
• Many different algorithms for sorting
• One of the difficulties with sorting is working with a
fixed size storage container (array)
– resizing is expensive (slow)
• The "simple" sorts run in quadratic time O(N²)
– bubble sort
– selection sort
– insertion sort



Stable Sorting
• A property of sorts
• If a sort guarantees the relative order of equal items stays
the same then it is a stable sort.
• [7₁, 6, 7₂, 5, 1, 2, 7₃, -5]
– subscripts added for clarity
• [-5, 1, 2, 5, 6, 7₁, 7₂, 7₃]
– result of stable sort
• Real world example:
– sort a table in Wikipedia by one criterion, then another
– sort by country, then by major wins



Selection sort
• Algorithm
– search through the list and find the smallest element
– swap the smallest element with the first element
– repeat starting at the second element to find the second
smallest element
public static void selectionSort(int[] list)
{ int min;
int temp;
for(int i = 0; i < list.length - 1; i++) {
min = i;
for(int j = i + 1; j < list.length; j++)
if( list[j] < list[min] )
min = j;
temp = list[i];
list[i] = list[min];
list[min] = temp;
}
}
Selection Sort in Practice
44 68 191 119 119 37 83 82 191 45 158 130 76 153 39 25

What is the T(N), actual number of statements executed, of


the selection sort code, given a list of N elements?
What is the Big O?

Selection sort has a time complexity of O(n²), where n is the number of
elements in the array: pass i makes n-i comparisons to find the minimum,
for a total of n(n-1)/2 comparisons.



Generic Selection Sort
public void selectionSort(Comparable[] list)
{ int min; Comparable temp;
for(int i = 0; i < list.length - 1; i++) {
min = i;
for(int j = i + 1; j < list.length; j++)
if( list[min].compareTo(list[j]) > 0 )
min = j;
temp = list[i];
list[i] = list[min];
list[min] = temp;
}
}
Best case, worst case, average case Big O?
• Selection sort performs at O(n²) in the best, average, and worst case.
Attendance Question 3
Is selection sort always stable?
A. Yes
B. No

Stability:
• Selection Sort is not stable, so equal elements may not stay in
the same order after sorting.



Insertion Sort
• Another of the O(N²) sorts
• The first item is sorted
• Compare the second item to the first
– if smaller, swap
• Third item: compare to the item next to it
– swap if needed
– after a swap, compare again
• And so forth…



Insertion Sort Code
public void insertionSort(int[] list)
{ int temp, j;
for(int i = 1; i < list.length; i++)
{ temp = list[i];
j = i;
while( j > 0 && temp < list[j - 1])
{ // swap elements
list[j] = list[j - 1];
list[j - 1] = temp;
j--;
}
}
}
• Best case, worst case, average case Big O?
– O(n) in the best case; O(n²) in the average and worst case.
Attendance Question 4
• Is the version of insertion sort shown always stable?
A. Yes
B. No

Yes, this version of insertion sort is stable: the relative order of
duplicate elements does not change, because an element is only moved
past strictly larger elements.



Comparing Algorithms
• Which algorithm do you think will be faster given random
data, selection sort or insertion sort?
• Why?

• Insertion sort generally performs better in practice, especially on
nearly sorted data, due to fewer comparisons on average, making it
more efficient than bubble sort and selection sort in typical scenarios.



Sub Quadratic Sorting Algorithms
Sub quadratic means having a Big O better than O(N²)

Next Session!



ShellSort
• Created by Donald Shell in 1959
• Wanted to stop moving data small distances (as in insertion
sort and bubble sort) and stop making swaps that are not
helpful (as in selection sort)
• Start with sub arrays created by looking at data that is far
apart, then reduce the gap size



ShellSort in practice
46 2 83 41 102 5 17 31 64 49 18
Gap of five. Sort sub array with 46, 5, and 18
5 2 83 41 102 18 17 31 64 49 46
Gap still five. Sort sub array with 2 and 17
5 2 83 41 102 18 17 31 64 49 46
Gap still five. Sort sub array with 83 and 31
5 2 31 41 102 18 17 83 64 49 46
Gap still five Sort sub array with 41 and 64
5 2 31 41 102 18 17 83 64 49 46
Gap still five. Sort sub array with 102 and 49
5 2 31 41 49 18 17 83 64 102 46
Completed Shellsort
5 2 31 41 49 18 17 83 64 102 46
Gap now 2: Sort sub array with 5 31 49 17 64 46
5 2 17 41 31 18 46 83 49 102 64
Gap still 2: Sort sub array with 2 41 18 83 102
5 2 17 18 31 41 46 83 49 102 64
Gap of 1 (Insertion sort)
2 5 17 18 31 41 46 49 64 83 102

Array sorted



Shellsort on Another Data Set
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
44 68 191 119 119 37 83 82 191 45 158 130 76 153 39 25

Initial gap = length / 2 = 16 / 2 = 8


initial sub arrays indices:
{0, 8}, {1, 9}, {2, 10}, {3, 11}, {4, 12}, {5, 13}, {6, 14}, {7, 15}
next gap = 8 / 2 = 4
{0, 4, 8, 12}, {1, 5, 9, 13}, {2, 6, 10, 14}, {3, 7, 11, 15}
next gap = 4 / 2 = 2
{0, 2, 4, 6, 8, 10, 12, 14}, {1, 3, 5, 7, 9, 11, 13, 15}
final gap = 2 / 2 = 1

ShellSort Code
public static void shellsort(Comparable[] list)
{ for(int gap = list.length / 2; gap > 0; gap /= 2)
for(int i = gap; i < list.length; i++)
{ Comparable tmp = list[i];
int j = i;
for( ; j >= gap &&
tmp.compareTo( list[j - gap] ) < 0;
j -= gap )
list[ j ] = list[ j - gap ];
list[ j ] = tmp;
}
}



Comparison of Various Sorts
Num Items Selection Insertion Shellsort Quicksort
1000 16 5 0 0
2000 59 49 0 6
4000 271 175 6 5
8000 1056 686 11 0
16000 4203 2754 32 11
32000 16852 11039 37 45
64000 expected? expected? 100 68
128000 expected? expected? 257 158
256000 expected? expected? 543 335
512000 expected? expected? 1210 722
1024000 expected? expected? 2522 1550

times in milliseconds
Quicksort
8 Invented by C.A.R. (Tony) Hoare
8 A divide and conquer approach
that uses recursion
1. If the list has 0 or 1 elements it is sorted
2. otherwise, pick any element p in the list. This is called
the pivot value
3. Partition the list minus the pivot into two sub lists
according to values less than or greater than the pivot.
(equal values go to either)
4. return the quicksort of the first list followed by the
quicksort of the second list
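The four steps above can be sketched directly (an illustrative, not in-place, version; the class name and the list-based partition are our assumptions — production quicksorts usually partition in place):

```java
import java.util.ArrayList;
import java.util.List;

public class QuickSortSketch {
    // Follows the slide's four steps literally: pick a pivot,
    // partition into smaller/larger lists, recurse, concatenate.
    static List<Integer> quicksort(List<Integer> list) {
        if (list.size() <= 1)
            return list;                          // step 1: 0 or 1 elements
        int pivot = list.get(list.size() / 2);    // step 2: pick a pivot
        List<Integer> less = new ArrayList<>();
        List<Integer> greater = new ArrayList<>();
        boolean pivotSkipped = false;
        for (int x : list) {                      // step 3: partition
            if (!pivotSkipped && x == pivot) { pivotSkipped = true; continue; }
            if (x < pivot) less.add(x); else greater.add(x);
        }
        List<Integer> result = new ArrayList<>(quicksort(less)); // step 4
        result.add(pivot);
        result.addAll(quicksort(greater));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(quicksort(List.of(39, 23, 17, 90, 33, 72, 46)));
        // prints [17, 23, 33, 39, 46, 72, 90]
    }
}
```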
Quicksort in Action
39 23 17 90 33 72 46 79 11 52 64 5 71
Pick middle element as pivot: 46
Partition list
23 17 5 33 39 11 46 79 72 52 64 90 71
quick sort the less than list
Pick middle element as pivot: 33
23 17 5 11 33 39
quicksort the less than list, pivot now 5
{} 5 23 17 11
quicksort the less than list, base case
quicksort the greater than list
Pick middle element as pivot: 17
and so on….



Attendance Question 5
• What is the best case and worst case Big O of quicksort?
      Best          Worst
A.  O(N log N)    O(N²)
B.  O(N²)         O(N²)
C.  O(N²)         O(N!)
D.  O(N log N)    O(N log N)
E.  O(N)          O(N log N)

• The Big O of the quicksort algorithm is O(n log n) in the best
and average case, and O(n²) in the worst case.
• Quicksort is a popular sorting algorithm that uses the
divide-and-conquer strategy to sort elements.



Attendance Question 6
• You have 1,000,000 items that you will be searching.
How many searches need to be performed before the
data is changed to make sorting worthwhile?
A. 10
B. 40
C. 1,000
D. 10,000
E. 500,000
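One hedged way to estimate the break-even point, assuming an average linear search costs N/2 comparisons, sorting costs roughly N lg N comparisons, and each binary search afterwards costs lg N (these cost models are our assumptions, not the slides'):

```java
public class BreakEven {
    // Smallest number of searches k for which sorting first (n lg n)
    // plus k binary searches (lg n each) beats k linear searches
    // (n/2 comparisons each on average).
    static long breakEven(long n) {
        double lgN = Math.log(n) / Math.log(2);   // about 19.9 for n = 1,000,000
        long k = 1;
        while (k * (n / 2.0) < n * lgN + k * lgN)
            k++;
        return k;
    }

    public static void main(String[] args) {
        System.out.println(breakEven(1_000_000));  // prints 40
    }
}
```

Under this model the break-even for 1,000,000 items is roughly 2 lg N ≈ 40 searches.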



Merge Sort Algorithm
Don Knuth cites John von Neumann as the creator
of this algorithm
1. If a list has 1 element or 0 elements
it is sorted
2. If a list has more than 1 element, split it
into 2 separate lists
3. Perform this algorithm on each of
those smaller lists
4. Take the 2 sorted lists and merge
them together
Merge Sort

When implementing, one temporary array is used instead
of multiple temporary arrays.

Why?



Merge Sort code
/**
* perform a merge sort on the data in c
* @param c c != null, all elements of c
* are the same data type
*/
public static void mergeSort(Comparable[] c)
{ Comparable[] temp = new Comparable[ c.length ];
sort(c, temp, 0, c.length - 1);
}

private static void sort(Comparable[] list, Comparable[] temp,
int low, int high)
{ if( low < high){
int center = (low + high) / 2;
sort(list, temp, low, center);
sort(list, temp, center + 1, high);
merge(list, temp, low, center + 1, high);
}
}
Merge Sort Code
private static void merge( Comparable[] list, Comparable[] temp,
int leftPos, int rightPos, int rightEnd){
int leftEnd = rightPos - 1;
int tempPos = leftPos;
int numElements = rightEnd - leftPos + 1;
//main loop
while( leftPos <= leftEnd && rightPos <= rightEnd){
if( list[ leftPos ].compareTo(list[rightPos]) <= 0){
temp[ tempPos ] = list[ leftPos ];
leftPos++;
}
else{
temp[ tempPos ] = list[ rightPos ];
rightPos++;
}
tempPos++;
}
//copy rest of left half
while( leftPos <= leftEnd){
temp[ tempPos ] = list[ leftPos ];
tempPos++;
leftPos++;
}
//copy rest of right half
while( rightPos <= rightEnd){
temp[ tempPos ] = list[ rightPos ];
tempPos++;
rightPos++;
}
//Copy temp back into list
for(int i = 0; i < numElements; i++, rightEnd--)
list[ rightEnd ] = temp[ rightEnd ];
}
Thank you!!

Next Session:
Chapter 06:

61

You might also like