Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

CS353 Midterm Slides

Download as pdf or txt
Download as pdf or txt
You are on page 1of 297

‫ر‬

‫الجامعة السعودية االلكتونية‬


‫ر‬
‫االلكتونية‬ ‫الجامعة السعودية‬

‫‪26/12/2021‬‬
College of Computing and Informatics

Design and Analysis of Algorithms


Design and Analysis of Algorithms

Module 1
Fundamental of Algorithms
Week Learning Outcomes

Explain algorithms’ meaning and their role in solving problems

Classify the types of common problems and various types of data structures

4
Topic Outline

➢ The definition of algorithm


➢ Fundamentals of Algorithmic Problem Solving
➢ Important Problem Types

5
The definition of algorithm
What is an algorithm

An algorithm is a sequence of clear instructions


for solving a problem for acquiring a required Problem
output for any legitimate input in a finite
amount of time.

Algorithm

Input
Computer output

7
The Basic blocks of an algorithm

▪ The clear requirement for each step of an


algorithm cannot be compromised.
▪ The careful specification for the range of inputs
Problem
for which an algorithm works for.
▪ Several different ways can be used for
represented the same algorithm.
▪ Several algorithms can be found for solving the Algorithm
same problem. What do distinguish these
algorithms ?
✓ using different ideas , Input
Computer output
✓ different speeds.

8
Characteristics of an algorithm

▪ Input: Zero / more quantities are externally supplied.


▪ Output: At least one quantity is produced.
▪ Definiteness: Each instruction is clear and unambiguous.
▪ Finiteness: If the instructions of an algorithm is traced then for all cases the
algorithm must terminates after a finite number of steps.
▪ Efficiency: Every instruction must be very basic and runs in short time.

9
Steps for writing an algorithm
1. An algorithm is a procedure. It has two parts; the first part is head and the second
part is body.
2. The Head section consists of keyword Algorithm and Name of the algorithm with
parameter list.
E.g. Algorithm name1(p1, p2,…,p3)

The head section also has the following:


//Problem Description:
//Input:
//Output:
3. In the body of an algorithm various programming constructs like if, for, while and
some statements like assignments are used and uses different data structures
likes Arrays, Trees , Lists and Sets .

10
Steps for writing an algorithm

4. The compound statements may be enclosed with { and } brackets. if, for, while
can be closed by endif, endfor, endwhile respectively. Proper indention is must
for block.
5. Comments are written using // at the beginning.
6. The identifier should begin by a letter and not by digit. It contains alpha numeric
letters after first letter. No need to mention data types.
7. The left arrow “←” used as assignment operator. E.g. v←10
8. Boolean operators (TRUE, FALSE), Logical operators (AND, OR, NOT) and
Relational operators (<,<=, >, >=,=, ≠, <>) are also used.
9. Input and Output can be done using read and write.

11
Example
Consider Euclid’s algorithm for computing gcd(m, n) in simple steps

Step 1 If n = 0, return the value of m as the answer and stop; otherwise, proceed to Step 2.
Step 2 Divide m by n and assign the value of the remainder to r.
Step 3 Assign the value of n to m and the value of r to n. Go to Step 1.

Euclid’s algorithm for computing gcd(m, n) expressed in pseudocode


ALGORITHM Euclid_gcd(m, n)
//Computes gcd(m, n) by Euclid’s algorithm
//Input: Two nonnegative, not-both-zero integers m and n
//Output: Greatest common divisor of m and n
while n ≠ 0 do
r ←m mod n
m←n
n←r
12
return m
Fundamentals of Algorithmic Problem Solving
Steps of Designing and Analysis of Algorithms

14
Understanding The problem
This is the first step in designing of algorithm.

In this step :

• Read the problem’s description carefully to understand the problem


statement completely.
• Ask questions for clarifying the doubts about the problem.
• Identify the problem types and use existing algorithm to find solution.
• Input (instance) to the problem and range of the input get fixed.
• Remember that a correct algorithm is not one that works most of the
time, but one that works correctly for all legitimate inputs.
• Skipping the first step will run the risk of unnecessary rework.

15
Decision making

The Decision making is done on the following:

(a) Ascertaining the Capabilities of the Computational Device

• In random-access machine (RAM), instructions are executed one after another


(The central assumption is that one operation at a time)
• Algorithms designed to be executed on such machines are called sequential
algorithms.
• In some newer computers, operations are executed concurrently.
• Algorithms that take advantage of this capability are called parallel algorithms.
• Choice of computational devices like Processor and memory is mainly based on
space and time efficiency

16
Decision making

The Decision making is done on the following:

(b) Choosing between Exact and Approximate Problem Solving

• An algorithm used to solve the problem exactly and produce correct result is called an
exact algorithm.
• If the problem is so complex and not able to get exact solution, then we have to choose
an algorithm called an approximation algorithm.

Examples: extracting square roots, solving nonlinear equations, and evaluating definite
integrals.

17
Decision making
The Decision making is done on the following:

( c ) Algorithm Design Techniques


• An algorithm design technique is a general approach to solving problems algorithmically
that is applicable to a variety of problems from different areas of computing.
Algorithms + Data Structures = Programs
• The choice of proper data structure is required before designing the algorithm.
• Implementation of algorithm is possible only with the help of Algorithms and Data
Structures
• Some of Algorithmic technique Brute Force, Divide and Conquer, Dynamic Programming,
Greedy Technique and so on.

18
Methods of Specifying an Algorithm

There are three ways to specify an algorithm.

Algorithm Spesifications

( a) ( b) ( c)
Natural Language Pseudocode Flowchart

Pseudocode is the option that are most widely used nowadays for specifying algorithms.
19
Methods of Specifying an Algorithm
a. Natural Language
• It is very simple and easy to specify an algorithm
Drawbacks:
many times provides a brief specification.

Example: An algorithm to perform addition of two numbers.


Step 1: Read the first number, say a.
Step 2: Read the second number, say b.
Step 3: Add the above two numbers and store the result in c.
Step 4:( a)
Display the result from c.
ThinkNatural
of implementing
Language this ?

• Many programmers prefer to have algorithm specification in Pseudocode.

20
Methods of Specifying an Algorithm
b. Pseudocode
• Pseudocode is a mixture of a natural language and programming language constructs.
• Pseudocode is usually more precise than natural language.
For Assignment operation left arrow “←”, for comments two slashes “//”,if condition,
for, while loops are used.
Example:
ALGORITHM Sum(a,b)
//Problem Description: This algorithm performs addition of two numbers
//Input: Two integers a and b
//Output:( a)
Addition of two integers
c←a+b
Natural Language
return c

• This specification is more useful for implementation of any language.


21
Methods of Specifying an Algorithm
c. Flowchart
• Flowchart is a graphical representation of an algorithm by using a collection of
connected geometric shapes containing descriptions of the algorithm’s steps.

• It is the dominant method in the earlier days of computing,


• This representation technique is inconvenient.

Flowchart Symbols Start Start state Input & Output

( a) Transition / Assignment Condition / Decision


Natural Language
Processing/ read input Flow conectivity

Stop Stop state


22
Methods of Specifying an Algorithm
c. Flowchart

Example: Addition two integers a & b

Note: flowcharts can be found only in old


algorithm books . Pseudocode is dominate
method

23
Proving an Algorithm’s Correctness

• Algorithm's correctness comes after the specification of an algorithm.


• An algorithm must give a required result for every legitimate input in a finite amount of
time.

• A mathematical induction is a common technique for proving correctness of an algorithm.


It is used because an algorithm’s iterations provide a natural sequence of steps

• The notion of correctness for approximation algorithms is less straightforward than it is for
exact algorithms. The error produced by the algorithm should not exceed a predefined
limit.

24
Analyzing an Algorithm

• There are two kinds of algorithm efficiency. They are:


Time efficiency, indicating how fast the algorithm runs, and
Space efficiency, indicating how much extra memory it uses.

• The efficiency of an algorithm is determined by measuring both time efficiency and


space efficiency.

So factors to analyze an algorithm are:


➢ Time efficiency of an algorithm
➢ Space efficiency of an algorithm
➢ Simplicity of an algorithm
➢ Generality of an algorithm

25
Coding an Algorithm

• The coding / implementation of an algorithm is done by a suitable programming


language like C, C++, JAVA.

• It is very essential to write an optimized code (efficient code) to reduce the burden of
compiler.

26
Important Problem Types
Important Problem Types

The most important problems types are :


➢ Sorting
➢ Searching
➢ String processing
➢ Graph problems
➢ Combinatorial problems
➢ Geometric problems
➢ Numerical problems

28
Important Problem Types

(i) Sorting
• The sorting problem is to rearrange the items of a given list in nondecreasing
(ascending) order.
• Sorting can be done on numbers, characters, strings or records.
• To sort student records in alphabetical order of names or by student number or by
student grade-point average. Such a specially chosen piece of information is called a
key.
• An algorithm is said to be in-place if it does not require extra memory, E.g., Quick sort.
• A sorting algorithm is called stable if it preserves the relative order of any two equal
elements in its input.

29
Important Problem Types

(ii) Searching
• The searching problem deals with finding a given value, called a search key, in a given
set.
E.g., Ordinary Linear search and fast binary search.

(iii) String processing


• A string is a sequence of characters from an alphabet.
• Strings comprise letters, numbers, and special characters; bit strings, which comprise
zeros and ones; and gene sequences, which can be modeled by strings of characters
from the four character alphabet {A, C, G, T}. It is very useful in bioinformatics.
• Searching for a given word in a text is called string matching

30
Important Problem Types

(iv) Graph problems


• A graph is a collection of points called vertices, some of which are connected by line
segments called edges.
• Some of the graph problems are graph traversal, shortest path algorithm, topological
sort, traveling salesman problem and the graph-coloring problem and so on.

(v) Combinatorial problems


• These are problems that ask, explicitly or implicitly, to find a combinatorial object such
as a permutation, a combination, or a subset that satisfies certain constraints.
• A desired combinatorial object may also be required to have some additional property
such as a maximum value or a minimum cost.
• In practical, the combinatorial problems are the most difficult problems in computing.
The traveling salesman problem and the graph coloring problem are examples of
combinatorial problems.
31
Important Problem Types

(vi) Geometric problems


• Geometric algorithms deal with geometric objects such as points, lines, and polygons.
• Geometric algorithms are used in computer graphics, robotics, and tomography.
• The closest-pair problem and the convex-hull problem are comes under this category.

(vii) Numerical problems


• Numerical problems are problems that involve mathematical equations, systems of
equations, computing definite integrals, evaluating functions, and so on.
• The majority of such mathematical problems can be solved only approximately.

32
Activity !!!
In groups , state the
basic types of data
structure and define
their features ?

33
Summary

• Algorithms for the same problem can be based on very different ideas and can solve
the problem with dramatically different speeds.
• Input, output, definiteness, finiteness and efficiency are the main characteristics of any
algorithms
• We introduced the major steps for writing an algorithms and also the steps for
designing and analysis an algorithms in details.
• Sorting problems , Searching problems , String matching problems Graph problems ,
Combinatorial problems , Geometric problems and Numerical problems
• Algorithm design techniques (or “strategies” or “paradigms”) are general
approaches to solving problems algorithmically, applicable to a variety of problems
from different areas of computing.

34
List of Readings

➢ Required Reading :
Chapter 1 from Anany Levitin (2012) . Introduction to the
Design and Analysis of Algorithms. Pearson, 2012, 3rd edition

➢ Recommending Reading :
Chapter 1 from Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Rivest, Clifford Stein ( 2022 ). Introduction to Algorithms. MIT Press.
.Fourth edition

Chapter 1 from Michael T.Goodrich, Roberto Tamassia (2014).


Algorithm Design and Applications. First Edition Wiley.

35
Thank You

36
‫ر‬
‫الجامعة السعودية االلكتونية‬
‫ر‬
‫االلكتونية‬ ‫الجامعة السعودية‬

‫‪26/12/2021‬‬
College of Computing and Informatics

Design and Analysis of Algorithms


Design and Analysis of Algorithms

Module 2
Algorithms Analysis Methods
Week Learning Outcomes

Express of analysis of algorithms framework and analysis framework

Explain the growth function Asymptotic notation, functions, and running times

Apply analysis of algorithm framework for recursive and non-recursive functions

40
Topic Outline

➢ The algorithm analysis framework


➢ Asymptotic notations and basic efficiency classes
➢ Application of mathematical analysis for non- recursive and
recursive algorithms
➢ Empirical Analysis of Algorithm

41
The Analysis Framework
Analysis of algorithms

• Issues:
• correctness
• time efficiency
• space efficiency
• optimality

• Approaches:
• empirical analysis – less useful
• theoretical analysis – most important

43
The Analysis Framework

There are two kinds of efficiency:

Time efficiency also can be called time complexity indicating how fast the algorithm
runs.
Space efficiency, also called space complexity indicating the amount of memory
units required by the algorithm as well as the space needed for its input and output.

The algorithm analysis framework consists of the following:


• Measuring an Input’s Size
• Units for Measuring Running Time
• Orders of Growth
• Worst-Case, Best-Case, and Average-Case Efficiencies

44
Theoretical analysis of time efficiency

Time efficiency is analyzed by determining the number of repetitions of


the basic operation as a function of input size

• Basic operation: the operation that contributes most towards the


running time of the algorithm
input size

T(n) ≈ copC(n)
Number of times
running time execution time basic operation is
for basic operation executed

45
Measuring an Input’s Size
• An algorithm’s efficiency is defined as a function of some parameter n indicating
the algorithm’s input size.
.
• For the problem of evaluating a polynomial 𝑝 𝑥 = 𝑎𝑛 𝑥 𝑛 + … . +𝑎0 of degree n, the
size of the parameter will be the polynomial’s degree or the number of its
coefficients, which is larger by 1 than its degree.
• In computing the product of two n ✕ n matrices, the choice of a parameter indicating
an input size does matter.
• Consider a spell-checking algorithm. If the algorithm examines individual characters
of its input, then the size is measured by the number of characters.
• In measuring input size for algorithms solving problems such as checking primality
of a positive integer n. The input is just one number.
• The input size by the number b of bits in the n’s binary representation is
𝑏 = ( log 2 𝑛) + 1 .
46
Units for Measuring Running Time
• Some standard unit of time measurement such as a second, or millisecond, and so on can
be used to measure the running time of a program after implementing the algorithm.

Drawbacks
1. Dependence on the speed of a particular computer.
2. Dependence on the quality of a program implementing the algorithm.
3. The compiler used in generating the machine code.
4. The difficulty of clocking the actual running time of the program.

• One possible approach is to count the number of times each of the algorithm’s operations
is executed. This approach is excessively difficult.
• The most important operation (+, -, *, /) of the algorithm, called the basic operation.
Computing the number of times the basic operation is executed is easy. The total running
time is determined by basic operations count.
47
Input size and basic operation examples

Problem Input size measure Basic operation

Searching for key in a list of n


Number of list’s items, i.e. n Key comparison
items

Matrix dimensions or total number Multiplication of two


Multiplication of two matrices
of elements numbers

Checking primality of a given n’size = number of digits (in binary


Division
integer n representation)

Visiting a vertex or
Typical graph problem #vertices and/or edges
traversing an edge

48
Orders of Growth

• A difference in running times on small inputs is not what really distinguishes efficient
algorithms from inefficient ones.

• For example, the greatest common divisor of two small numbers, it is not immediately
clear how much more efficient Euclid’s algorithm is compared to the other algorithms, the
difference in algorithm efficiencies becomes clear for larger numbers only.

49
Orders of Growth
• For large values of n, it is the function’s order of growth that counts As shown in table
below, which contains values of a few functions particularly important for analysis of
algorithms

50
Worst-case, Best-case, and Average-case Efficiencies
• Consider Sequential Search algorithm some search key K
ALGORITHM SequentialSearch(A[0..n - 1], K)
//Searches for a given value in a given array by sequential search
//Input: An array A[0..n - 1] and a search key K
//Output: The index of the first element in A that matches K or -1 if there are no
// matching elements
i ←0
while i < n and A[i] ≠ K do Clearly, the running time of this algorithm can be quite
i ←i + 1 different for the same list size n.
if i < n return i
else return -1

In the worst case, there is no matching of elements or the first matching element
can found at last on the list.
In the best case, there is matching of elements at first on the list. 51
Worst-case Efficiency

• The worst-case efficiency of an algorithm is its efficiency for the worst case input of size
𝑛.
• The algorithm runs the longest among all possible inputs of that size.
• For the input of size 𝑛, the running time is 𝐶𝑤𝑜𝑟𝑠𝑡 𝑛 = 𝑛 .
• The worst-case analysis provides very important information about an algorithm’s
efficiency by bounding its running time from above it guarantees that for any
instance of size n, the running time will not exceed 𝐶𝑤𝑜𝑟𝑠𝑡 𝑛 , its running time on
the worst-case inputs

Think about Worst-Case efficiency in Sequential Search !!!

52
Best-case Efficiency

• The best-case efficiency of an algorithm is its efficiency for the best case input of size 𝑛.
• The algorithm runs the fastest among all possible inputs of that size 𝑛.

• In sequential search, If we search a first element in list of size 𝑛. (i.e. first element equal
to a search key), then the running time is 𝐶𝑏𝑒𝑠𝑡 𝑛 = 1

53
Average-case Efficiency

• The Average case efficiency lies between best case and worst case.
• To analyze the algorithm’s average case efficiency, we must make some assumptions
about possible inputs of size 𝑛.

• The standard assumptions are that


o The probability of a successful search is equal to 𝑝 0 ≤ 𝑝 ≤ 1 and
o The probability of the first match occurring in the 𝑖𝑡ℎ position of the list is the same
for every 𝑖 .

54
Average case efficiency

In the case of a successful search, the probability of the first match occurring in the 𝑖𝑡ℎ
position of the list is 𝑝/𝑛 for every 𝑖, and the number of comparisons made by the
algorithm in such a situation is obviously 𝑖.

In the case of an unsuccessful search, the number of comparisons will be 𝑛 with the
probability of such a search being (1 − 𝑝). Therefore,

55
Amortized Efficiency

• Amortized efficiency applies not to a single run of an algorithm but rather to a


sequence of operations performed on the same data structure.

• In some situations ,a single operation can be expensive, but the total time for an
entire sequence of 𝑛 such operations is always significantly better than the worst-
case efficiency of that single operation multiplied by 𝑛.

• “Amortize” the high cost of such a worst-case occurrence over the entire sequence in
a manner similar to the way a business would amortize the cost of an expensive item
over the years of the item’s productive life.

• This sophisticated approach was discovered by Robert Tarjan, who used it, among
other applications, in developing an interesting variation of the classic binary search
tree
56
Key Points of the Analysis Framework

• Both time and space efficiencies are measured as functions of the algorithm’s input size.
• Time efficiency is measured by counting the number of times the algorithm’s basic
operation is executed.
• Space efficiency is measured by counting the number of extra memory units consumed by
the algorithm.
• The efficiencies of some algorithms may differ significantly for inputs of the same size.
• For such algorithms, we need to distinguish between the worst-case, average-case, and
best-case efficiencies.
• The framework’s primary interest lies in the order of growth of the algorithm’s running time
(extra memory units consumed) as its input size goes to infinity.

57
Asymptotic Notations and Basic Efficiency Classes
Asymptotic notation
• Asymptotic notation is a notation, which is used to take meaningful statement about the
efficiency of a program.
• The efficiency analysis framework concentrates on the order of growth of an algorithm’s
basic operation count as the principal indicator of the algorithm’s efficiency.
• To compare and rank such orders of growth, computer scientists use three notations, they
are:

O - Big oh notation
Ω - Big omega notation
Θ - Big theta notation

59
Asymptotic notation
• Let 𝑡(𝑛) and 𝑔(𝑛) can be any nonnegative functions defined on the set of natural
numbers.
• The algorithm’s running time 𝑡(𝑛) usually indicated by its basic operation count 𝐶(𝑛), and
𝑔(𝑛), some simple function to compare with the count.

Example 1:

𝑶(𝒈(𝒏)) is the set of all functions with a lower or same order of growth as 𝑔(𝑛) (to within a
constant multiple, as 𝑛 goes to infinity).
Thus, to give a few examples, the following assertions are all true:

n ∈ O(n2), 100n + 5 ∈ O(n2),

60
Asymptotic notation
Example 1:
• The second notation, Ω (𝒈(𝒏)), stands for the set of all functions with a higher or same
order of growth as 𝒈(𝒏) (to within a constant multiple, as 𝑛 goes to infinity).
For example,

• The third notian ⍬(𝑔(𝑛)) is the set of all functions that have the same order of
growth as 𝑔(𝑛) (to within a constant multiple, as n goes to infinity). Thus, every
quadratic function
is in ⊝ (𝑛2 )

61
O-notation

A function 𝑡(𝑛) is said to be in 𝑂(𝑔(𝑛)),


denoted 𝑡 (𝑛) ∈ 𝑂(𝑔(𝑛)),
if 𝑡(𝑛) is bounded above by some constant
multiple of 𝑔(𝑛) for all large 𝑛,
i.e., if there exist some positive constant 𝑐 and
some nonnegative integer 𝑛0 such that

t(n) ≤ cg(n) for all n ≥ n0.

Where 𝑡(𝑛) and 𝑔(𝑛) are nonnegative functions


defined on the set of natural numbers.

O = Asymptotic upper bound = Useful for worst case analysis = Loose bound

62
O-notation

Example :
formally prove that 100 𝑛 + 5 ∈ 𝑂(𝑛2 )
100 𝑛 + 5 ≤ 100 𝑛 + 𝑛 ∀ 𝑛 ≥ 5
= 101 𝑛 ≤ 101𝑛2
as values of the constants c and n0 required by the definition, we can take 101 and 5,
respectively

• the definition gives us a lot of freedom in choosing specific values for constants c
and n0.
• For example, we could also reason that
100𝑛 + 5 ≤ 100𝑛 + 5𝑛 𝑓𝑜𝑟 𝑎𝑙𝑙 𝑛 ≥ 1 = 105𝑛
i.e., 100𝑛 + 5 ≤ 105𝑛
i.e., 𝑡(𝑛) ≤ 𝑐𝑔(𝑛)
∴ 100 𝑛 + 5 ∈ 𝑂(𝑛) with c=105 and n0=1
63
Ω-notation

• A function 𝑡 (𝑛) is said to be in


Ω (𝑔(𝑛)), denoted 𝑡 𝑛 ∈ Ω 𝑔 𝑛 ,
if 𝑡 (𝑛) is bounded below by some positive
constant multiple of 𝑔(𝑛) for all large 𝑛,

i.e., if there exist some positive constant 𝑐 and


some nonnegative integer 𝑛0 such that
𝑡 (𝑛) ≥ 𝑐𝑔(𝑛) 𝑓𝑜𝑟 𝑎𝑙𝑙 𝑛 ≥ 𝑛0 .

• Ω = Asymptotic lower bound = Useful for


best case analysis = Loose bound

64
Ω-notation

Example
Prove the assertions
𝑛3 + 10 𝑛2 + 4 𝑛 + 2 ∈ Ω (𝑛2 ).
Proof:
𝑛3 + 10 𝑛2 + 4 𝑛 + 2 ≥ 𝑛2 (for all n ≥ 0)

i.e., by definition 𝑡(𝑛) ≥ 𝑐𝑔(𝑛), where 𝑐 = 1


and 𝑛0 =0

65
⍬-notation

• A function 𝑡(𝑛) is said to be in ⍬(𝑔(𝑛)),


denoted 𝑡(𝑛) ∈ ⍬(𝑔(𝑛)), if 𝑡(𝑛) is
bounded both above and below by some
positive constant multiples of g(n) for all
large n,

i.e., if there exist some positive constants 𝑐1 c1


and 𝑐2 and some nonnegative integer 𝑛0 such
that
𝑐2 g(n) ≤ t(n) ≤ 𝑐1 g(n) for all 𝑛 ≥ 𝑛0

• Where 𝑡(𝑛) and 𝑔(𝑛) are nonnegative functions defined on the set of natural numbers.
• Θ = Asymptotic tight bound = Useful for average case analysis
66
⍬-notation
Example

67
MATHEMATICAL ANALYSIS FOR RECURSIVE ALGORITHMS
Mathematical Analysis of Recursive Algorithms

General Plan for Analyzing the Time Efficiency of Recursive Algorithms

1. Decide on a parameter (or parameters) indicating an input’s size.


2. Identify the algorithm’s basic operation.
3. Check whether the number of times the basic operation is executed can vary on
different inputs of the same size; if it can, the worst-case, average-case, and best-case
efficiencies must be investigated separately.
4. Set up a recurrence relation, with an appropriate initial condition, for the number of
times the basic operation is executed.
5. Solve the recurrence or, at least, ascertain the order of growth of its solution.

69
Mathematical Analysis of Recursive Algorithms

EXAMPLE 1:
Compute the factorial function 𝐹(𝑛) = 𝑛! for an arbitrary nonnegative integer n.
Since 𝑛! = 1 •. . . .• (𝑛 − 1) • 𝑛 = (𝑛 − 1)! • 𝑛, 𝑓𝑜𝑟 𝑛 ≥ 1 𝑎𝑛𝑑 0! = 1
by definition, we can compute 𝐹(𝑛) = 𝐹(𝑛 − 1) • 𝑛 with the following recursive
algorithm.

ALGORITHM 𝐹(𝑛)
//Computes n! recursively
//Input: A nonnegative integer n
//Output: The value of n!
if n = 0 return 1
else return F(n − 1) * n

70
Mathematical Analysis of Recursive Algorithms

Algorithm analysis
• For simplicity, we consider n itself as an indicator of this algorithm’s input size. i.e. 1.
• The basic operation of the algorithm is multiplication, whose number of executions we
denote M(n). Since the function F(n) is computed according to the formula
F(n) = F(n −1)•n for n > 0.
• The number of multiplications M(n) needed to compute it must satisfy the equality

• M(n − 1) multiplications are spent to compute F(n − 1), and one more multiplication is
needed to multiply the result by n.

71
Mathematical Analysis of Recursive Algorithms

Algorithm analysis - Recurrence relations


• To solve the recurrence relation 𝑀(𝑛) = 𝑀(𝑛 − 1) + 1 , i.e., to find an explicit
formula for 𝑀(𝑛) in terms of n only.
• To determine a solution uniquely, we need an initial condition that tells us the value with
which the sequence starts. We can obtain this value by inspecting the condition that
makes the algorithm stop its recursive calls:
if n = 0 return 1.

1- Since the calls stop when n = 0, the smallest value of n for which this algorithm
is executed and hence M(n) defined is 0.
2- By inspecting the pseudocode’s exiting line, we can see that when n = 0, the
algorithm performs no multiplications.

72
Mathematical Analysis of Recursive Algorithms

Algorithm analysis - Recurrence relations

Thus, the recurrence relation and initial condition for the algorithm’s number of
multiplications
• 𝑀(𝑛):
• 𝑀(𝑛) = 𝑀(𝑛 − 1) + 1 𝑓𝑜𝑟 𝑛 > 0,
• 𝑀(0) = 0 𝑓𝑜𝑟 𝑛 = 0.

73
Mathematical Analysis of Recursive Algorithms

Algorithm analysis - Recurrence relations

Method of backward substitutions

74
Mathematical Analysis of Recursive Algorithms

Algorithm analysis - Recurrence relations

Method of backward substitutions

75
Activity !!!
• Using method of backward
substitutions Apply similar
analysis for Recursive solution
to the Tower of Hanoi puzzle

76
https://www.codeproject.com/Articles/393158/Towers-of-Hanoi
Analyzing the Time Efficiency of Non-recursive Algorithms

General Plan for Analyzing the Time Efficiency of Non- recursive Algorithms
1. Decide on a parameter (or parameters) indicating an input’s size.
2. Identify the algorithm’s basic operation (in the innermost loop).
3. Check whether the number of times the basic operation is executed depends only on
the size of an input. If it also depends on some additional property, the worst-case,
average-case, and, if necessary, best-case efficiencies have to be investigated
separately.
4. Set up a sum expressing the number of times the algorithm’s basic operation is
executed.
5. Using standard formulas and rules of sum manipulation either find a closed form
formula for the count or at the least, establish its order of growth.

77
Analyzing the Time Efficiency of Non-recursive Algorithms

EXAMPLE 1:
Consider the problem of finding the value of the largest element in a list of n
numbers. Assume that the list is implemented as an array for simplicity

ALGORITHM 𝑀𝑎𝑥𝐸𝑙𝑒𝑚𝑒𝑛𝑡(𝐴[0. . 𝑛 − 1])


//Determines the value of the largest element in a given array
//Input: An array A[0..n − 1] of real numbers
//Output: The value of the largest element in A
maxval ←A[0]
for i ←1 to n − 1 do
if A[i]>maxval
maxval←A[i]
return maxval
78
Analyzing the Time Efficiency of Non-recursive Algorithms

Algorithm analysis
• The measure of an input’s size here is the number of elements in the array, i.e., n.
• There are two operations in the for loop’s body:
o The comparison A[i]> maxval and
o The assignment maxval←A[i].

• The comparison operation is considered as the algorithm’s basic operation, because


the comparison is executed on each repetition of the loop and not the assignment.

• The number of comparisons will be the same for all arrays of size n; therefore, there
is no need to distinguish among the worst, average, and best cases here.

79
Analyzing the Time Efficiency of Non-recursive Algorithms
Algorithm analysis

• Let C(n) denotes the number of times this comparison is executed. The algorithm
makes one comparison on each execution of the loop, which is repeated for each
value of the loop’s variable i within the bounds 1 and n − 1, inclusive. Therefore, the
sum for C(n) is calculated as follows:

80
Analyzing the Time Efficiency of Non-recursive Algorithms

EXAMPLE 2:
Consider the element uniqueness problem: check whether all the Elements in a given array
of n elements are distinct.

ALGORITHM UniqueElements(A[0..n − 1])


//Determines whether all the elements in a given array are distinct
//Input: An array A[0..n − 1]
//Output: Returns “true” if all the elements in A are distinct and “false” otherwise
for i ←0 to n − 2 do
for j ←i + 1 to n − 1 do
if A[i]= A[j ] return false
return true

81
Analyzing the Time Efficiency of Non-recursive Algorithms

Algorithm analysis
• The natural measure of the input’s size here is again n (the number of elements in
the array). Since the innermost loop contains a single operation (the comparison of
two elements), we should consider it as the algorithm’s basic operation.
• The number of element comparisons depends not only on n but also on whether
there are equal elements in the array and, if there are, which array positions they
occupy. We will limit our investigation to the worst case only.
• One comparison is made for each repetition of the innermost loop, i.e., for each
value of the loop variable j between its limits i + 1 and n − 1; this is repeated for each
value of the outer loop, i.e., for each value of the loop variable i between its limits 0
and n − 2.

82
Analyzing the Time Efficiency of Non-recursive Algorithms

Algorithm analysis

83
Empirical Analysis of Algorithm

• The principal alternative to the mathematical analysis of an algorithm’s efficiency is


its empirical analysis.
• It is useful when the mathematical analysis is difficult
General Plan for the Empirical Analysis of Algorithm Time Efficiency
1.Understand the experiment’s purpose.
2.Decide on the efficiency metric M to be measured and the measurement unit (an
operation count vs. a time unit).
3.Decide on characteristics of the input sample (its range, size, and so on).
4.Prepare a program implementing the algorithm (or algorithms) for the
experimentation.
5.Generate a sample of inputs.
6.Run the algorithm (or algorithms) on the sample’s inputs and record the data
observed.
7.Analyze the data obtained. 84
Summary

• Time efficiency and Space efficacy are two major elements for analysis any algorithms
• We introduced the analysis framework and show how can be implemented and
explained Asymptotic notation.
• For some algorithms, the running time can differ considerably for inputs of the same
size, leading to worst-case efficiency, average-case efficiency, and best-case efficiency.
• The established framework for analyzing time efficiency is primarily grounded in the
order of growth of the algorithm’s running time as its input size goes to infinity.
• We showed the mathematical analysis for recursive and non-recursive algorithms with
examples
• Empirical analysis of an algorithm is performed by running a program implementing
the algorithm on a sample of inputs and analyzing the data observed (the basic
operation’s count or physical running time). This often involves generating
pseudorandom numbers.

85
List of Readings

➢ Required Reading :
Chapter 2 from Anany Levitin (2012) . Introduction to the
Design and Analysis of Algorithms. Pearson, 2012, 3rd edition

➢ Recommending Reading :
Chapter 1 from Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Rivest, Clifford Stein ( 2022 ). Introduction to Algorithms. MIT Press.
.Fourth edition

Chapter 1 from Michael T.Goodrich, Roberto Tamassia (2014).


Algorithm Design and Applications. First Edition Wiley.

86
Thank You

87
‫ر‬
‫الجامعة السعودية االلكتونية‬
‫ر‬
‫االلكتونية‬ ‫الجامعة السعودية‬

‫‪26/12/2021‬‬
College of Computing and Informatics

Design and Analysis of Algorithms


Design and Analysis of Algorithms

Module 3
Brute Force and Exhaustive Search
Week Learning Outcomes

Describe Brute force and exhaustive search

Recognize two sorting problems solved by Brute force

Demonstrate sequential search and brute-force in string matching problem

Demonstrate brute-force in solving closest-Pair

Distinguish between the Depth-First Search and Breadth-First Search


Topic Outline

➢ Selection Sort and Bubble Sort

➢ Sequential Search and Brute-Force String Matching

➢ Solving Closest-Pair Problem by Brute Force

➢ Exhaustive Search

➢ Depth-First Search and Breadth-First Search


Introduction
• Brute force is a straightforward approach to solving a problem,
• It is based on the problem statement and definitions of the concepts involved.
• The “force” implied by the strategy’s definition → ‘ Just do it ‘
• The brute-force strategy is one of is easiest strategy
• Generally , it is inefficient but can be useful for solving small-size instances of a
problem
Selection Sort and Bubble Sort
Selection Sort
• Selection sort started by scanning the entire given list to find its smallest element
and exchange it with the first element, putting the smallest element in its final
position in the sorted list.
• Then we scan the list, starting with the second element, to find the smallest among
the last n − 1 elements and exchange it with the second element, putting the
second smallest element in its final position
• Generally, on the ith pass through the list, which we number from 0 to n − 2, the
algorithm searches for the smallest item among the last n − i elements and swaps
it with Ai:
Selection Sort
Selection Sort
• Example
Using Selection sort algorithm sort this list 89, 45, 68, 90, 29, 34, 17

Itration 1
Itration 2
Itration 3
Itration 4

Itration 5
Itration 6
Itration 7
Selection Sort
• Algorithm Analysis
• The input size is given by the number of elements n; the basic operation is
the key comparison A[j ] < A[min].
• The number of times it is executed depends only on the array size and is given
by the following sum:

or

• Selection sort is a 0(n 2) algorithm on all inputs.


Bubble Sort
• Compare adjacent elements of the list and exchange them if they are out of order.
• By doing it repeatedly, we end up “bubbling up” the largest element to the last
position on the list.
• The next pass bubbles up the second largest element, and so on, until after n − 1
passes the list is sorted.
• Pass i (0 ≤ i ≤ n − 2) of bubble sort can be represented by the following diagram:
Bubble Sort
Bubble Sort

• Example
Using Bubble algorithm sort this list 89, 45, 68, 90, 29, 34, 17

and so on
Bubble Sort

• Algorithm Analysis
• The number of key comparisons for the bubble-sort version is the same for all
arrays of size n; it is obtained by a sum that is almost identical to the sum for
selection sort :

• In the worst case of decreasing arrays, it is the same as the number of key
comparisons
Sequential Search and Brute-Force String Matching
Sequential Search

• The sequential algorithm compares successive elements of a given list with a


given search key until either a match is encountered (successful search) or the
list is exhausted without finding a match (unsuccessful search).
• If we append the search key to the end of the list, the search for the key will
have to be successful, and
• So, we can eliminate the end of list check altogether.
• Another enhancement in the algorithm is soring the list. This allows searching in a
list be stopped as soon as an element greater than or equal to the search key is
encountered.
Sequential Search
Brute-Force String Matching

• String Matching problem:


For a given a string of n characters called the text & a string of m characters (m ≤ n)
called the pattern, find a substring of the text that matches the pattern.

We want to find 𝑖 − 𝑡ℎ𝑒 𝑖𝑛𝑑𝑒𝑥 of the leftmost character of the first matching substring
in the text—such that 𝑡𝑖 = 𝑝 0 , … , 𝑡𝑖 + 𝑗 = 𝑝 𝑗 , … , 𝑡𝑖 + 𝑚 − 1 = 𝑝𝑚 − 1:

text T

pattern P
Brute-Force String Matching
Brute-Force String Matching

Algorithm Analysis
• The algorithm shifts the pattern almost always after a single character comparison.

• The worst case : the algorithm may have to make all m comparisons before
shifting the pattern, which it can happen for each of n − m + 1 tries.
• Worst -case : m(n − m + 1) character comparisons ∈ O(nm) class
• Average-case efficiency should be considerably better than the worst-case
efficiency.
• For searching in random texts, it has been shown to be linear, i.e., 0(n)
Closest-Pair Problem
Closest-Pair Problem

• The problem is for finding the two closest points in a set of n points.
• It is the simplest problems in computational geometry dealing with proximity of
points in the plane or higher-dimensional spaces.
• Points represents such physical objects such as airplanes or post offices as well as
database records, statistical samples, DNA sequences … etc.
• Cluster analysis in statistics is one of the important applications of the closest-pair
problem.
• Hierarchical cluster analysis seeks to organize them in a hierarchy of clusters based
on some similarity metric . Usually similarity the Euclidean distance for numerical
analysis, the Hamming distance for text and other nonnumerical data, metrics
such as
Closest-Pair Problem

The brute-force approach for solving Closest-Pair Problem

• Compute the distance between each pair of distinct points and find a pair with
the smallest distance.
• Considering only the pairs of points (pi, p j ) for which i < j , to avoid computing
distance for same pair twice .
Closest-Pair Problem
Closest-Pair Problem

Algorithm Analysis

• The basic operation of the algorithm is computing the square root which The
number of times will be executed can be computed as follows:

• Speeding up the innermost loop of the algorithm could only decrease


the algorithm’s running time by a constant factor
Exhaustive Search
Exhaustive Search

• Exhaustive search is simply a brute-force approach to combinatorial problems.


• It suggests generating each and every element of the problem domain,
selecting those of them that satisfy all the constraints, and then finding a
desired element
e.g., the one that optimizes some objective function
• Its implementation typically requires an algorithm for generating certain
combinatorial objects

• Illustrating exhaustive search into three important problems: the traveling


salesman problem, the knapsack problem, and the assignment problem
Traveling Salesman Problem (TSP)

• TSP has been intriguing researchers for the last 150 years.
• The problem asks to find the shortest tour through a given set of n cities that
visits each city exactly once before returning to the city where it started.
• It can be modeled by a weighted graph, with the graph’s vertices representing
the cities and the edge weights specifying the distances.
• Then the problem can be stated as the problem of finding the shortest
Hamiltonian circuit of the graph.

Hamiltonian circuit can also be defined as a sequence of n+1 adjacent vertices


vi , vi ,. . ., vi , vi , where the first vertex of the sequence is the same as the last
0 1 n−1 0

one and all the other n − 1 vertices are distinct.

• All the tours obtained by generating all the permutations of n − 1


intermediate cities, computing the tour lengths and finding the shortest among
them
Traveling Salesman Problem (TSP)

Illustrating a sample of
Hamiltonian circuit

The total number of permutations


needed is still 1/2 (n − 1)!

Exhaustive-search approach impractical


for all but very small values of n
Knapsack Problem

Given n items of known weights w1, w 2 ,..., wn and values v1, v 2 ,..., vn and a knapsack
of capacity W , find the most valuable subset of the items that fit into the knapsack

Example : a transport plane that has to deliver the most valuable set of items to a
remote location without exceeding the plane’s capacity.
Knapsack Problem

• The exhaustive-search approach to knapsack problem is:


• Generating all the subsets of the set of n items given, computing the total
weight of each subset in order to identify feasible subsets

Algorithm Analysis of exhaustive search for knapsack search:


• The number of subsets of an n-element set is 2n, the exhaustive search leads to a Q(2n)
algorithm, even efficiently individual subsets are generated. → Not best solution

• Both TSM and Knapsack problems are the best-known examples for NP-hard
problems – will be discussed in week 11
Knapsack Problem & The exhaustive-search

Knapsack Problem

exhaustive-search
Assignment Problem
Assignment Problem

• There are n people who need to be assigned to execute n jobs, one person per
job. (That is, each person is assigned to exactly one job and each job is assigned
to exactly one person.)
• The cost that would accrue if the ith person is assigned to the j th job is a known
quantity C[i, j ] for each pair i, j = 1, 2,..., n.
• The problem is to find an assignment with the minimum total cost.
Assignment Problem

• Feasible solutions to the assignment problem as n-tuples (j 1 ,. . ., jn) in which the


ith component, i = 1,. . ., n, indicates the column of the element selected in the
ith row (i.e., the job number assigned to the ith person).

• For example, for the cost matrix above, (2, 3, 4, 1) indicates the assignment of
Person 1 to Job 2, Person 2 to Job 3, Person 3 to Job 4, and Person 4 to Job 1.
Assignment Problem

Algorithm Analysis of exhaustive search for assignment problem:


• The requirements implies that there is a one-to-one correspondence between
feasible assignments and permutations of the first n integers.
• As a result, the exhaustive-search approach to the assignment problem would
require generating all the permutations of integers 1, 2,..., n, computing the
total cost of each assignment by summing up the corresponding elements of the
cost matrix, and finally selecting the one with the smallest sum.

These are few iterations of


exhaustive-search for
assignment problem
Depth-First Search and Breadth-First Search
Depth-First Search ( DFS)
• Examples for applications Depth- First Search

a) Graph
b) Traversal’s stack
c) DFS forest with the
tree and back edges
Depth-First Search ( DFS)
• DFS in Graph :
• Starts a graph’s traversal at an arbitrary vertex by marking it as visited
• On each iteration, the algorithm proceeds to an unvisited vertex that is adjacent
to the one it is currently in.
• If there are several such vertices, it can be resolved arbitrarily.
• This process continues until a dead end—a vertex with no adjacent unvisited
vertices— is encountered.
• At a dead end, the algorithm backs up one edge to the vertex it came from and
tries to continue visiting unvisited vertices from there.
• The algorithm eventually halts after backing up to the starting vertex, with the
latter being a dead end. By then, all the vertices in the same connected
component as the starting vertex have been visited. If unvisited vertices still
remain, the depth-first search must be restarted at any one of them.
Depth-First Search ( DFS)
Depth-First Search (DFS)

• How efficient is depth-first search?

• It takes just the time proportional to the size of the data structure used for
representing the graph in question.

• For the adjacency matrix representation , the traversal time is in 0(|V |2)

• For the adjacency list representation , it is in 0(|V |+ |E|) where |V | and |E|
are the numbers graph’s vertices and edges, respectively
Breath-First Search ( BFS)
• Examples for applications Breath- First Search

a) Graph
b) Traversal’s stack
c) BFS forest
Breadth-First Search (BFS)
• It proceeds in a concentric manner by visiting first all the vertices that are
adjacent to a starting vertex, then all unvisited vertices two edges apart from it,
and so on, until all the vertices in the same connected component as the starting
vertex are visited.
• If there still remain unvisited vertices, the algorithm has to be restarted at an
arbitrary vertex of another connected component of the graph.

• It is convenient to use a queue. The queue is initialized with the traversal’s


starting vertex, which is marked as visited.
• On each iteration, the algorithm identifies all unvisited vertices that are adjacent
to the front vertex, marks them as visited, and adds them to the queue; after
that, the front vertex is removed from the queue.
Breadth-First Search (BFS)
Breadth-First Search (BFS)

Analysis of BFS:
• Breadth-first search has the same efficiency as depth-first search :
• For the adjacency matrix representation , it is in 0(|V |2) and
• For the adjacency list representation , it is 0(|V |+ |E|).
• In addition , the queue is a FIFO (first-in first-out) structure adds a single
ordering of vertices.
Summary

• Brute force is a straightforward approach to solving a problem, usually directly


based on the problem statement and definitions of the concepts involved.
• A first application of the brute-force approach often results in an algorithm that
can be improved with a modest amount of effort.
• The following noted algorithms can be considered as examples of the brute- force
approach: definition-based algorithm for matrix multiplication selection sort ,
sequential search and straightforward string-matching algorithm
• Exhaustive search is a brute-force approach to combinatorial problems. It
suggests generating each and every combinatorial object of the problem,
selecting those of them that satisfy all the constraints, and then finding a desired
object.
Summary

• The traveling salesman problem, the knapsack problem, and the assignment
problem are typical examples of problems that can be solved, at least
theoretically, by exhaustive-search algorithms.
• Depth-first search (DFS) and breadth-first search (BFS) are two principal graph-
traversal algorithms. By representing a graph in a form of a depth-first or breadth-
first search forest, they help in the investigation of many important properties of
the graph. Both algorithms have the same time efficiency: 0(|V |2) for the
adjacency matrix representation and 0(|V |+ |E|) for the adjacency list
representation.
List of Readings

➢ Required Reading :
Chapter 3 from Anany Levitin (2012) . Introduction to the
Design and Analysis of Algorithms. Pearson, 2012, 3rd edition
Thank You
‫ر‬
‫الجامعة السعودية االلكتونية‬
‫ر‬
‫االلكتونية‬ ‫الجامعة السعودية‬

‫‪26/12/2021‬‬
College of Computing and Informatics

Design and Analysis of Algorithms


Design and Analysis of Algorithms

Module 4
Decrease-and-Conquer
Week Learning Outcomes

Describe Decrease-and-Conquer and their various types

Describe the insert sort , topological Sorting and analysis their complexity

Recognize Constant-Factor Algorithms and variable size algorithms


Topic Outline

➢ Insertion Sort
➢ Topological Sorting
➢ Decrease-by-a-Constant-Factor Algorithms
➢ Variable-Size-Decrease Algorithms
Introduction

• The decrease-and-conquer technique is Problem (𝑥)


based on exploiting the relationship
Step 1: Decrease
between a solution to a given instance
of a problem and a solution to its Subproblem ( 𝑥 ′ )
smaller instance. Step 2: Conquer

Sub-Solution
• Once such a relationship is established, Step 3: Combine
it can be exploited either top down or
bottom up. Solution

Illustration of Decrease-(by one)-and-conquer technique

https://www3.cs.stonybrook.edu/~pramod.ganapathi/doc/algorithms/Algo-DecreaseConquer.pdf
Introduction

• The top down variations is usually implemented using recursive approach.


• The bottom-up variation is usually implemented iteratively, starting with a solution to
the smallest instance of the problem; it is called sometimes the incremental
approach.
• There are three major variations of decrease-and-conquer:
1- Decrease by a constant
2- Decrease by a constant factor
3- Variable size decrease

Teplogical Sotring (1)

https://www.scaler.com/topics/data-structures/topological-sort-algorithm/m/ (1)
Decrease by a constant

• decrease-by-a-constant variation, the size of an


instance is reduced by the same constant on
each iteration of the algorithm.
• This constant is equal to one , although other
constant size reductions do happen occasionally.
• The exponentiation problem of computing an
where a /= 0 and n is a nonnegative integer.
• The relationship between a solution to an instance of
size n and an instance of size n − 1 is obtained by the
obvious formula an = an−1 f (n) = an
Decrease by a constant factor

• The problem instance in decrease-by-a-


constant-factor technique is reduced by the
same constant factor on each iteration of
the algorithm.
• In most applications, this constant factor is
equal to two.

Give an example of such an algorithm?


Decrease by a constant factor
The techinque analysis:

• The algorithm is in 0(log n) if we compute an


recursively according to this formula and
measure the algorithm’s efficiency by the
number of multiplications.

• The reason for that is reduction on the size by


about a half on each iteration as well as the
expense of one or two multiplications.
Variable size decrease

• The size reduction pattern varies at each


iteration
• A good example of such approach is Euclid’s
algorithm for computing the greatest
common divisor.
• This algorithm is based on the formula
gcd(m, n) = gcd(n, m mod n )

• The value of the second argument is always


smaller on the right-hand side than on the
left-hand side. Euclid’s algorithm example (1)

1. https://scienceland.info/en/algebra8/euclid-algorithm
Insertion Sort
Insertion Sort
• Lets have a sorted array A[0..n − 1] , a new element x
need to be inserted in the array
• It is required to find an appropriate position for A[n − 1]
among the sorted elements and insert new element x
there.
• By scanning the sorted subarray from right to left until
the first element smaller than or equal to A[n − 1] is
encountered to insert A[n − 1] right after that element.
• The resulting algorithm is called straight insertion sort
or simply insertion sort.
• Insertion sort is clearly based on a recursive idea and
more
• Bottom top implementation is more efficient
https://programmercave0.github.io/blog/2017/08/20/C++-Insertion-Sort-using-STL-%28Sorting%29
Insertion Sort

https://programmercave0.github.io/blog/2017/08/20/C++-Insertion-Sort-using-STL-%28Sorting%29
Insertion Sort
The algorithm analysis:
• The basic operation of the algorithm is the key comparison A[j ] > v .
• The number of comparison depends on input nature.
• In the worst case, A[j ] >v is executed the largest number of times,
i.e., for every j = i − 1, ... , 0.
• Since v = A[i], it happens if and only if A[j ] > A[i] for j = i − 1, ... , 0.
• Thus, for the worst-case input,
we get A[0] > A[1] (for i = 1),
A[1] > A[2] (for i = 2),. . .,
A[n − 2] > A[n − 1] (for i = n − 1)
Insertion Sort
The algorithm analysis:
• The worst-case input is an array of decreasing values

• In the best case, the comparison A[j ] >v is executed only once on every iteration
of the outer loop.
• It happens if and only if A[i − 1] ≤ A[i] for every i = 1,...,n − 1,
i.e., if the input array is already sorted in
non-decreasing order.
Insertion Sort
The algorithm analysis:
• Average-case efficiency is based on investigating the number of element pairs
that are out of order It happens if and only if A[i − 1] ≤ A[i] for every i
= 1,...,n − 1,
• It shows that on randomly ordered arrays, insertion sort makes on average
half as many comparisons as on decreasing arrays,
i.e.,
• This twice-as-fast average-case performance coupled with an excellent
efficiency on almost-sorted arrays
• Insertion sort is better than selection sorting and bubble sort
Topological Sorting
Topological Sorting
What is a directed graph?
• A directed graph, or digraph for short, is a graph with
directions specified for all its edges.

• The adjacency matrix and adjacency lists are still two


principal means of representing a digraph. a) An example of directed graph

• The differences between undirected and directed


graphs
(1) the adjacency matrix of a directed graph does
not have to be symmetric;
(2) an edge in a directed graph has just one (not
two) corresponding nodes in the
digraph’s adjacency lists. b) DFS for forest of the digraph
for the DFS traversal started at a
Topological Sorting
What is a directed graph?
• Traversal algorithms for traversing digraphs are Depth-
first search and breadth-first search

• all four types of edges possible in a DFS forest of a


directed graph: a) An example of directed graph
o tree edges (ab, bc, de),
o back edges (ba) from vertices to their ancestors
o forward edges (ac) from vertices to their descendants in
the tree other than their children
o cross edges (dc), which are none of the aforementioned
types.
b) DFS for forest of the digraph
for the DFS traversal started at a
Topological Sorting
What is a directed graph?
• A directed cycle in a digraph is a sequence of three or
more of its vertices that starts and ends with the same
vertex and in which every vertex is connected to its
immediate predecessor by an edge directed from the
predecessor to the successor. a) An example of directed graph
• For example, a, b, a is a directed cycle in the digraph in
the figure a.
• If a DFS forest of a digraph has no back edges, the
digraph is a dag, an acronym for directed acyclic graph

b) DFS for forest of the digraph


for the DFS traversal started at a
Topological Sorting
Motivation Example
• Consider a set of five required courses
• {C1, C2, C3, C4, C5} a part-time student has to take in some degree program.
• The courses can be taken in any order as long as the following course
prerequisites are met:
C1 and C2 have no prerequisites
C3 requires C1 and C2
C4 requires C3
C5 requires C3 and C4.

The student can take only one course per term. In which order should the student
take the courses?
Topological Sorting
Motivation Example

• Modelling the problem into a directed graph


representing the prerequisite structure of five courses

• This is a topological sorting problem


finding the order a student needs to follow
• Topological sorting of vertices of a directed acyclic graph
is an ordering of the vertices𝑣1 , 𝑣2 , . . . , 𝑣𝑛
in such a way that there is an edge directed towards vertex 𝑣𝑗 from
vertex𝑣𝑖 , then𝑣𝑖 . comes before 𝑣𝑗

Note : if a digraph has no directed cycles, the topological sorting


problem for it has a solution
Topological Sorting
The source-removal algorithm

• Repeatedly, identify in a remaining digraph a source, which is a vertex with no


incoming edges, and delete it along with all the edges outgoing from it.
• If there are several sources, break the tie arbitrarily
• The order in which the vertices are deleted yields a solution to the topological
sorting problem.
Topological Sorting
The application for the source-removal algorithm in solving the motivated example
Topological Sorting

The pseudocode for source-removal algorithm


Algorithm
Decrease-by-a-Constant-Factor Algorithms
Binary Search

• Binary search is a remarkably efficient algorithm for searching in a sorted array.


• It works by comparing a search key K with the array’s middle element A[m].
• If they match, the algorithm stops;
• otherwise, the same operation is repeated recursively for the first half of the
array if K < A[m], and for the second half if K > A[m]:
Binary Search

• Given the following array find the Key= 70

Index 0 1 2 3 4 5 6 7 8 9 10 11 12
Value 3 14 27 31 39 42 55 70 74 81 85 93 98
r
Iteration 1 l m
Iteration 2 l m r
Iteration 3 l, m r
Binary Search

Binary search is clearly based on a recursive


idea, but it can be implemented as a non
recursive algorithm.
Here is pseudocode of this non recursive
version.
Binary Search
The algorithm analysis:
• We need to count the number of times the search key is compared with an
element of the array.
• This assumes that after one comparison of K with A[m], the algorithm can
determine whether K is smaller, equal to, or larger than A[m].
• In the worst case Cworst(n). The worst-case inputs include all arrays that do
not contain a given search key, as well as some successful searches.
• Since after one comparison the algorithm faces the same situation but for an
array half the size, we get the following recurrence relation for Cworst(n):
Binary Search
The algorithm analysis:
• For Average-case analysis , A sophisticated analysis shows that the average
number of key comparisons made by binary search is only slightly smaller
than that in the worst case:

• Binary search is an optimal searching algorithm if we restrict our op- erations


only to comparisons between keys
Variable-Size-Decrease Algorithms
Computing a Median and the Selection Problem
The slection problem:
Finding the kth smallest element (or kth order statistic) in a given array A[0..n − 1].
• The easiest case is to extract the Minimum (k=1) and Max ( k= n) values from an
array
• The hardest case is to extract the Median value (k= n/2)

• To find the kth smallest element in a list :


• Sort the list first and then select the kth element in the output of a sorting
algorithm.
• Note : The time of such an algorithm is determined by the efficiency of the sorting
algorithm used. → So, use the efficient sorting algorithm
The Selection Problem

The Randomized partitioning :


Another effective strategy in selection is randomized partitioning a given list around
some value p this is a rearrangement of the list’s elements so that the left part
contains all the elements smaller than or equal to p, followed by the pivot p itself,
followed by all the elements greater than or equal to p

p value Rearrangement around p value


The Selection Problem
The Randomized partitioning :

p value

Rearrangement around p value

https://www3.cs.stonybrook.edu/~pramod.ganapathi/doc/algorithms/Algo-DecreaseConquer.pdf
The Lomuto partitioning

• Any subarray A[l..r] (0 ≤ l ≤ r ≤ n − 1) as composed of three contiguous


segments.
Listed in the order they follow pivot p

• a segment with elements known to be smaller than p,


the segment of elements known to be greater than or equal to p,
and the segment of elements yet to be compared to p

Note that the segments can be empty;


The Lomuto partitioning
The Lomuto partitioning

Applying Lomuto partitioning


algorithm with an example

https://www3.cs.stonybrook.edu/~pramod.ganapathi/doc/algorithms/Algo-DecreaseConquer.pdf
Quick Select
• To find the kth smallest element in array A[0..n − 1]
• let s be the partition’s split position,
i.e., the index of the array’s element occupied by the pivot after
partitioning.
If s = k − 1, pivot p itself is obviously the kth smallest element
If s > k − 1, the kth smallest element in the entire array can be found as the
kth smallest element in the left part of the partitioned array.
if s < k − 1, it can be found as the (k − s)th smallest element in its right part.
if we do not solve the problem, we reduce its instance to a smaller one,
which can be solved by the same approach, i.e., recursively.
Quick Select
Quick Select

Quickselect Algorithm Analysis ?

• Partitioning an n-element array always requires n − 1 key comparisons.


• If it produces the split that solves the selection problem without requiring more
iterations, then for this best case is Cbest(n) = n − 1 ∈ 0(n).
• the algorithm can produce an extremely unbalanced partition of a given array, with one
part being empty and the other containing n − 1 elements.
• In the worst case, this can happen on each of the n − 1 iterations. This implies that:

Cbest(n) = n − 1 ∈ 0(n).
Summary

• Decrease-and-conquer is based on exploiting a relationship between a solution to


a given instance of a problem and a solution to a smaller instance of the same
problem. After establishing relationship , it can be exploited either top down
(usually recursively) or bottom up.
• Three major types of Decrease-and-conquer are decrease-by-a-constant, most often
by one (e.g., insertion sort), decrease-by-a-constant-factor, most often by the factor
of two (e.g., binary search) and variable-size-decrease (e.g., Euclid’s algorithm)
• Insertion sort is a 0(n 2) algorithm both in the worst and average cases, but it is
about twice as fast on average than in the worst case. The algorithm’s notable
advantage is a good performance on almost-sorted arrays.
Summary

• The topological sorting problem asks to list vertices of a digraph in an order such
that for every edge of the digraph, the vertex it starts at is listed before the vertex it
points to.
• Binary search is a very efficient algorithm for searching in a sorted array. It is a
principal example of a decrease-by-a-constant-factor algorithm.
• the size reduction varies from one iteration of the algorithm to another. Examples
of such variable-size- decrease algorithms include Euclid’s algorithm, the partition-
based algorithm for the selection problem
List of Readings

➢ Required Reading :
Chapter 4 from Anany Levitin (2012) . Introduction to the
Design and Analysis of Algorithms. Pearson, 2012, 3rd edition
Thank You
‫ر‬
‫الجامعة السعودية االلكتونية‬
‫ر‬
‫االلكتونية‬ ‫الجامعة السعودية‬

‫‪26/12/2021‬‬
College of Computing and Informatics

Design and Analysis of Algorithms


Design and Analysis of Algorithms

Module 5
Divide-and-Conquer
Week Learning Outcomes

Describe Divide-and-Conquer and their various types

Describe the Mergesort, Quicksort and analysis their complexity

Recognize Binary Tree Traversals and Related Properties

The apply multiplication of large Integers and Strassen’s Matrix Multiplication

Demonstrate the Closest-Pair Problem by Divide-and-Conquer and its analysis


Topic Outline

➢ Divide-and-Conquer Explanation
➢ Merge sort & Quick Sort
➢ Binary Tree Traversals and Related Properties
➢ Multiplication of Large Integers and Strassen’s Matrix
Multiplication
➢ The Closest-Pair Problem by Divide-and-Conquer
Divide-and-Conquer Explanation
Introduction
• Divide-and-conquer is the best-known general
algorithm design technique.
• Divide-and-conquer plan is :
1. Divide : a problem is divided into several
subproblems of the same type, ideally
of about equal size.
2. Conquer: the subproblems are solved recursively.
3. Combine : the solutions to the subproblems are
combined to get a solution to the
original problem.
• A divide-and-conquer algorithm is necessarily more
efficient than even a brute-force solution.
• The divide-and-conquer technique can suited for parallel Divide-and-conquer technique
computations used for solving complex problems
Divide-and-conquer

https://www.enjoyalgorithms.com/blog/merge-sort-algorithm
Introduction

Divide-and-conquer Analysis:
• For an instance of size n can be divided into b instances of size n/b, with a of them
needing to be solved. where a and b are constants; a ≥ 1 and b> 1
• Assuming that size n is a power of b
• The recurrence is for the running time :

T (n) = aT (n/b) + f (n) → divide-and-conquer recurrence

• f (n) is a function that accounts for the time spent on dividing an instance of size n
into instances of size n/b and combining their solutions
• The order of growth of its solution T (n) depends on the values of the constants a and
b and the order of growth of the function f (n)
Divide-and-conquer Analysis

Master Theorem
T (n) = aT (n/b) + f (n) then

Analogous results hold for the O and Q notations.


Example: in sum-computation algorithm (

the recurrence for the number of additions A(n) on inputs of size n = 2k is


A(n) = 2A(n/2) + 1.
where a = 2,b = 2, and d = 0; hence, since a > bd,
Thus, A(n) ∈ 0(nlogb a) = 0(nlog2 2) = 0(n).
Mergesort
Mergesort
• Mergesort is a perfect example of a successful application of the divide-and-
conquer technique.
• It sorts a given array A[0..n − 1] by dividing it into two halves
A[0..Ln/2J− 1] and A[Ln/2J..n − 1],
• Sorting each of them recursively, and
• Then merging the two smaller sorted arrays into a single sorted one.
Mergesort

• Merging process of two sorted arrays can be done as follows:


• Two pointers (array indices) are initialized to point to the first elements of the
arrays being merged.
• The elements pointed to are compared, and the smaller of them is added to a
new array being constructed;
• after that, the index of the smaller element is incremented to point to its
immediate successor in the array it was copied from.
• This operation is repeated until one of the two given arrays is exhausted, and
then the remaining elements of the other array are copied to the end of the
new array.
Mergesort

• Here is an illustrated example

https://www.enjoyalgorithms.com/blog/merge-sort-algorithm
Mergesort Pseudocode
Mergesort Analysis
How efficient is mergesort?
• For simplicity, assume that n is a power of 2, the recurrence relation for the
number of key comparisons C(n) is

• Let us analyze Cmerge(n):


• At each step, exactly one comparison is made, after which the total number of
elements in the two arrays still needing to be processed is reduced by 1.
Mergesort Analysis
How efficient is mergesort?

• In the worst case, neither of the two arrays becomes empty before the other one
contains just one element
• (e.g., smaller elements may come from the alternating arrays).
• or the worst case, Cmerge(n) = n − 1, and the recurrence

Hence, according to the Master Theorem, Cworst(n) ∈ 0(n log n)

For n = 2k , the exact solution to the worst-case recurrence is


Quicksort
Quicksort

• Quicksort divides its input elements according to their position in the


array. The division is done according to their value.

• A partition is an arrangement of the array’s elements so that all the


elements to the left of some element A[s] are less than or equal to A[s],
and all the elements to the right of A[s] are greater than or equal to it

• After a partition is achieved, A[s] will be in its final position in the sorted
array, and we can continue sorting the two subarrays to the left and to the
right of A[s] independently
Quicksort & Hoare Partitioning Pseudocode
Quicksort

• Here is an illustrated example

https://www.java67.com/2014/07/quicksort-algorithm-in-java-in-place-example.html
Hoare Partitioning – in Quick sort

• Here is an other example


explaining Hoare Partitioning

https://www3.cs.stonybrook.edu/~pramod.ganapathi/doc/algorithms/Algo-DecreaseConquer.pdf
Quick sort Analysis

• The number of key comparisons made before a partition is achieved is n + 1 if the


scanning indices cross over and n if they coincide
• If all the splits happen in the middle of corresponding subarrays, we will have the
best case.

• The number of key comparisons in the best case satisfies the recurrence

According to the Master Theorem, Cbest(n) ∈ 0(n log2 n); solving it exactly for

n = 2k yields Cbest(n) = n log2 n.


Quick sort Analysis

• In the worst case,


• if A[0..n − 1] is a strictly increasing array and we use A[0] as the pivot, the left-to-
right scan will stop on A[1] while the right-to-left scan will go all the way to reach
A[0], indicating the split at position 0 .
• After making n + 1 comparisons to get to this partition and exchanging the pivot
A[0] with itself, the algorithm will be left with the strictly increasing array A[1..n −
1] to sort.
• This sorting of strictly increasing arrays of diminishing sizes will continue until the
last one A[n − 2..n − 1] has been processed. The total number of key comparisons
made will be equal to
Quick sort Analysis

Let Cavg(n) be the average number of key comparisons made by quicksort on a


randomly ordered array of size n.
A partition can happen in any position s (0 ≤ s ≤ n − 1) after n + 1 comparisons are
made to achieve the partition.
After the partition, the left and right subarrays will have s and n − 1 − s elements,
respectively.
Assuming that the partition split can happen in each position s with the same
probability 1/n, we get the following recurrence relation:

on the average, quicksort makes


only 39% more comparisons than
in the best case
Activity !!!
• In groups, Compare between
QuickSort implemented with
Hoare Partitioning and
QuickSort implemented with
Lomuto partition ?
Binary Tree Traversals
Binary Tree

• A binary tree T is defined as a finite set of nodes that is


either empty or consists of a root and two disjoint binary
trees TL and TR called, respectively, the left and right
subtree of the root.
• A binary tree as a special case of an ordered tree
• Many problems about binary trees can be solved by
applying the divide-and-conquer technique

https://www.javatpoint.com/binary-tree-java
Binary Tree
• The most important divide-and-conquer algorithms for binary trees are the three
classic traversals: preorder, inorder, and postorder.
• All three traversals visit nodes of a binary tree recursively, i.e., by visiting the
tree’s root and its left and right subtrees. They differ only by the timing of the
root’s visit:
• In the preorder traversal, the root is visited before the left and right subtrees
are visited (in that order).
• In the inorder traversal, the root is visited after visiting its left subtree but
before visiting the right subtree.

• In the postorder traversal, the root is visited after visiting the left and right
subtrees (in that order).
Binary Tree

An example of Binary tree and its traversals.


Multiplication of Large Integers & Strassen’s Matrix
Multiplication
Multiplication of Large Integers
• Some applications like cryptography, require manipulation of integers that are
over 100 decimal digits long.
• This practical need supports investigations of algorithms for efficient
manipulation of large integers.

• Example:
Consider the problem of multiplying two (large) n-digit integers represented by
arrays of their digits such as:

A = 12345678901357986429
B = 87654321284820912836
Multiplication of Large Integers

• The previous example is solved by Solved by grade-school algorithm


• Similar the hand-and pencil method we use for performing multiplication numbers

• Efficiency: 𝑛2 . one-digit multiplications


Multiplication of Large Integers – First Cut
Multiplication of Large Integers – Second Cut
Strassen’s Matrix Multiplication

• The principal insight of the algorithm lies in the discovery that we can
find the product C of two 2 × 2 matrices A and B with just seven
multiplications

Where,
Strassen’s Matrix Multiplication
Algorithm Analysis:

If M(n) is the number of multiplications made by Strassen’s algorithm in


multiplying two n × n matrices

which is smaller than n3 required by the brute-force algorithm


Strassen’s Matrix Multiplication
Algorithm Analysis:
• The number of additions A(n) made by Strassen’s algorithm.
• 18 additions/subtractions of matrices of size n/2; when n = 1, no additions are
made since two numbers are simply multiplied.
• recurrence relation:

• According to the Master Theorem, A(n) ∈ 0(nlog2 7)


• Therefore, the number of additions has the same order of growth as
the number of multiplications
• This puts Strassen’s algorithm in 0(nlog2 7)
The Closest-Pair Problem by Divide-and-Conquer
The Closest-Pair Problem

• One of the efficient way to solve closest-pair problem by divide-and-using brute-


conquer
• The steps for the algorithm:
• Step 1: Divide the points given into two subsets 𝑆1 and 𝑆2 by a vertical line 𝑥 = 𝑐 so
that half the points lie to the left or on the line and half the points lie to the right or
on the line.
The Closest-Pair Problem

• The steps for the algorithm:


Step 2: Find recursively the closest pairs for the left and right subsets.
Step 3: Set d= min{d1, d2}
We can limit our attention to the points in the symmetric vertical strip of width 2d as
possible closest pair. Let 𝐶1 and 𝐶2 be the subsets of points in the left subset 𝑆1 and of
the right subset 𝐶1 , respectively, that lie in this vertical strip. The points in 𝐶1 and 𝐶2 are
stored in increasing order of their y coordinates, which is maintained by merging during
the execution of the next step.

Step 4: For every point 𝑃(𝑥,𝑦)in 𝐶1 , we inspect points in 𝐶2 that may be closer to 𝑃
than d. There can be no more than 6 such points (because 𝑑 ≤ 𝑑2 )
The efficiency of Closest-Pair Problem

Algorithm Analysis:
Summary

• Divide-and-conquer is a general algorithm design technique that solves a


problem by dividing it into several smaller subproblems of the same type (ideally,
of about equal size), solving each of them recursively, and then combining their
solutions to get a solution to the original problem.
• Running time T (n) of many divide-and-conquer algorithms satisfies the
recurrence T (n) = aT (n/b) + f (n). The Master Theorem establishes the order of
growth of its solutions.
• Mergesort is works by dividing an input array into two halves, sorting them
recursively, and then merging the two sorted halves to get the original array
sorted. The algorithm’s time efficiency is in 0(n log n) in all cases, with the
number of key comparisons being very close to the theoretical minimum. Its
principal drawback is a significant extra storage requirement.
Summary

• Quicksort works by partitioning its input elements according to their value relative
to some preselected element. Quicksort is noted for its superior efficiency among
n log n algorithms for sorting randomly ordered arrays but also for the quadratic
worst-case efficiency.
• The classic traversals of a binary tree—preorder, inorder, and postorder— and
similar algorithms that require recursive processing of both left and right subtrees
can be considered .
• Strassen’s algorithm needs only seven multiplications to multiply two 2 × 2
matrices. By exploiting the divide-and-conquer technique, this algorithm can
multiply two n × n matrices with about n2.807 multiplications.
• The divide-and-conquer technique can be successfully applied the closest-pair
problem.
List of Readings

➢ Required Reading :
Chapter 5 from Anany Levitin (2012) . Introduction to the
Design and Analysis of Algorithms. Pearson, 2012, 3rd edition

➢ Recommending Reading :
Chapter 4 from Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Rivest, Clifford Stein ( 2022 ). Introduction to Algorithms. MIT Press.
.Fourth edition

Chapter 8 & Chapter 11from Michael T.Goodrich, Roberto Tamassia (2014).


Algorithm Design and Applications. First Edition Wiley.
Thank You
‫ر‬
‫الجامعة السعودية االلكتونية‬
‫ر‬
‫االلكتونية‬ ‫الجامعة السعودية‬

‫‪26/12/2021‬‬
College of Computing and Informatics

Design and Analysis of Algorithms


Design and Analysis of Algorithms

Module 6
Transform-and-Conquer
Week Learning Outcomes

Describe Transform-and-Conquer and their various types

Express the presorting and Gaussian Elimination

Recognize Balanced Search Trees ( AVL) tree & its properties

233
Topic Outline

➢ Transform-and-Conquer Explanation
➢ Presorting
➢ Gaussian Elimination
➢ Balanced Search Trees (AVL trees and its properties )

234
Transform-and-Conquer Explanation
Transform-and-Conquer Explanation

• A group of techniques based on the idea of transformation to a problem


that is easier to solve.
• Work as two-stage procedures:
First Stage Transformation : the problem’s instance is modified to be, for one
reason or another, more amenable to solution.
Second Stage Conquering : where the problem is solved.

236
Transform-and-Conquer Explanation

Transformation variations can be either :

• Instance Simplification : a simpler/more convenient instance [e.g. sorted] of the same


problem
• Representation Change : a different representation of the same instance [e.g. different
data structure]
• Problem Reduction: a different problem for which an algorithm is already available

237
Instance Simplification - Presorting
Instance Simplification - Presorting
Solve a problem’s instance by transforming it into another simpler/easier instance of the
same problem

Presorting
Many problems involving lists are easier when list is sorted,
• e.g. searching
• computing the median (selection problem)
• checking if all elements are distinct (element uniqueness)

Also:
• Topological sorting helps solving some problems for dags.
• Presorting is used in many geometric algorithms [eg close pr]

239
Instance Simplification - Presorting
How can we fast the sorting process ?

• Efficiency of algorithms involving sorting depends on efficiency of sorting

• Theorem: log2 n!  n log2 n comparisons are necessary in the worst case to
sort a list of size n by any comparison-based algorithm.

• Note: About n log2 n comparisons are also sufficient to sort array of size n (by
mergesort and also quick sort

240
Searching with Presorting

Problem: Search for a given K in A[0..n-1]

Presorting-based algorithm:
Stage 1 Sort the array by an efficient sorting algorithm
Stage 2 Apply binary search

Efficiency: Θ(nlog n) + O(log n) = Θ(nlog n)

• Is this Good or bad?


• Why do we have our dictionaries, telephone directories, etc.
sorted?

241
Element Uniqueness with Presorting

Presorting-based algorithm
Stage 1: sort by efficient sorting algorithm (e.g. mergesort)
Stage 2: scan array to check pairs of adjacent elements

Efficiency: Θ(nlog n) + O(n) = Θ(nlog n)

One solution : Brute force algorithm


Compare all pairs of elements
Efficiency: O(n2)

• Can you think of another algorithms?

242
Element Uniqueness with Presorting

Presorting-based algorithm
Stage 1: sort by efficient sorting algorithm
(e.g. mergesort)
Stage 2: scan array to check pairs of
adjacent elements

• Running time of this algorithm is the sum of


the time spent on sorting and the time
spent on checking consecutive elements

Efficiency: Θ(nlog n) + O(n) = Θ(n log n)

Do the detailed analysis to see how we


obtained this formula ?
243
Element Uniqueness with Presorting

Presorting-based algorithm

• Another consideration to solve this problem is Brute force algorithm


Compare all pairs of elements
But Efficiency: O(n2) not the best of suggested method

Can you think of another algorithms? …

244
Element Uniqueness with Presorting

Presorting-based algorithm

Another consideration to solve this problem is Brute force algorithm


Compare all pairs of elements
But Efficiency: O(n2) not the best of suggested method

Can you think of another algorithms? …

• Hashing – frequently useful


• Closest pair

245
Activity !!!
• In groups
• Search how hashing is used in
finding uniqness elements in
an array ?

246
Instance simplification – Gaussian Elimination

Systems of two linear equations in two unknowns would be formulated as

This is easily be solved

But for given: A system of n linear equations in n unknowns with an arbitrary


coefficient matrix.

How can be is easily solved ?

247
Instance simplification – Gaussian Elimination

Transform to: An equivalent system of n linear equations in n unknowns with an


upper triangular coefficient matrix.

Solve the latter by substitutions starting with the last equation and moving up to
the first one

248
Instance simplification – Gaussian Elimination

More Explanation:
The transformation is accomplished by a sequence of elementary operations on
the system’s coefficient matrix (which don’t change the system’s solution):

for i ←1 to n-1 do
replace each of the subsequent rows (i.e., rows i+1, …, n) by
a difference between that row and an appropriate multiple of the i-th row to
make the new coefficient in the i-th column of that row 0

249
Example of Gaussian Elimination
Solve : 2x1 - 4x2 + x3 = 6
3x1 - x2 + x3 = 11
x1 + x2 - x3 = -3
Gaussian elimination
2 -4 1 6 2 -4 1 6
3 -1 1 11 row2 – (3/2)*row1 0 5 -1/2 2
1 1 -1 -3 row3 – (1/2)*row1 0 3 -3/2 -6 row3–(3/5)*row2
2 -4 1 6
0 5 -1/2 2
0 0 -6/5 -36/5
Backward substitution
x3 = (-36/5) / (-6/5) = 6
x2 = (2+(1/2)*6) / 5 = 1
x1 = (6 – 6 + 4*1)/2 = 2
250
Pseudocode and Efficiency of Gaussian Elim.
Stage 1: Reduction to an upper-triangular matrix
for i ← 1 to n-1 do
for j ← i+1 to n do
for k ← i to n+1 do
A[j, k] ← A[j, k] - A[i, k] * A[j, i] / A[i, i] //improve!

Stage 2: Back substitutions


for j ← n down to 1 do
t←0
for k ← j +1 to n do
t ← t + A[j, k] * x[k]
x[j] ← (A[j, n+1] - t) / A[j, j]

Efficiency: Θ(n3) + Θ(n2) = Θ(n3)


251
Searching Problem
Searching Problem

• Problem: Given a (multi)set S of keys and a search key K, find an occurrence


of K in S, if any

• Searching must be considered in the context of [context determines best data


structure]:
o file size (internal vs. external)
o dynamics of data (static vs. dynamic)

• Dictionary operations (dynamic data) [data structure determines performance]:


o find (search)
o insert
o delete

253
Taxonomy of Searching Algorithms
• List searching
o sequential search
o binary search
o interpolation search

• Tree searching
o binary search tree
o balanced binary trees:
- AVL trees
- red-black trees
o multiway trees [balanced]:
- 2-3 trees
-2-3-4 trees, B trees
• Hashing
o open hashing (separate chaining)
o closed hashing (open addressing)
254
Binary Search Tree
• Arrange keys in a binary tree with the binary search tree property:

<K >K

Example : 5,3,1,10,12,7,9
Which could/should be the root? Best/Worst?

255
Dictionary Operations on Binary Search Trees
➢ Searching – straightforward
• Insertion – search for key, insert at leaf where search terminated
• Deletion – 3 cases:
• deleting key at a leaf
• deleting key at node with single child
• deleting key at node with two children
➢ Efficiency depends of the tree’s height: log2 n  h  n-1,
• With height average (random files) be about 3log2 n
• Maximum number of nodes if height k?
• Thus all three operations have
• worst case efficiency: (n)
• average case efficiency: (log n)

256
Balanced Search Trees

Attractiveness of binary search tree is marred by the bad (linear) worst-case


efficiency.

o Two ideas to overcome it (ie keep BST close to balanced) are:


• To restructure BST to maintain balance when an insertion makes the tree “too
unbalanced”
AVL trees
red-black trees

• To allow more than one key per node of a search tree


2-3 trees

257
Balanced trees: AVL trees

Definition An AVL tree is a binary search tree in which, for every node, the
difference between the heights of its left and right subtrees, called the balance
factor, is at most 1 (with the height of an empty tree defined as -1)

Explain?
Tree (a) is an AVL tree; tree (b) is not an AVL tree
258
Rotations

If a key insertion violates the balance requirement at some node, the subtree
rooted at that node is transformed via one of the four rotations.

(The rotation is always performed for a subtree rooted at an “unbalanced”


node closest to the new leaf.)

259
Rotation Types (Single R-rotation)
Single right rotation, or R-rotation:

Imagine rotating the edge connecting the root and its left child in the binary
tree in right side.
Note:
This rotation is performed after a new key is inserted into the left subtree of
the left child of a tree whose root had the balance of +1 before the insertion.

Example of r-rotation
260
Symmetric Single L-rotation
Single lift rotation, or L-rotation:
The mirror image of the single R-rotation.

Note:
It is performed after a new key is inserted into the right subtree of the right
child of a tree whose root had the balance of −1 before the insertion.

261
Rotation Types (Double LR-rotation)
The double left-right rotation (LR- rotation)
A combination of two rotations: we perform the L-rotation of the left subtree of
root r , followed by the R-rotation of the new tree rooted at r

Note: It is performed after a new key is inserted into the right subtree of the
left child of a tree whose root had the balance of +1 before the insertion.

262
Symmetric Double RL-rotation
The double right-left rotation (RL- rotation)
Mirror image for double left-right rotation

Think of it as an exercise?

263
AVL tree construction - an example

Construct an AVL tree for the list


5, 6, 8, 3, 2, 4, 7
by successive insertions

264
Analysis of AVL trees

h  1.4404 log2 (n + 2) - 1.3277

average height: 1.01 log2n + 0.1 for large n (found empirically)

Search and insertion are O(log n)

Deletion is more complicated but is also O(log n)

Disadvantages:
• frequent rotations
• complexity

A similar idea: red-black trees (height of subtrees is allowed to differ by up to a


factor of 2)
265
Activity !!!
In groups
• Think of application for AVL
trees

266
Summary

• Transform-and-conquer is a group of techniques based on the idea of


transformation to a problem that is easier to solve.
• There are three principal varieties of the transform-and-conquer strategy:
instance simplification, representation change, and problem reduction.
• Instance simplification is transforming an instance of a problem to an instance of
the same problem with some special property that makes the problem easier to
solve. List presorting, Gaussian elimination, and rotations in AVL trees are good
examples of this strategy
• Representation change implies changing one representation of a problem’s
instance to another representation of the same instance .

267
List of Readings

➢ Required Reading :
Chapter 6 from Anany Levitin (2012) . Introduction to the
Design and Analysis of Algorithms. Pearson, 2012, 3rd edition

➢ Recommending Reading :
Chapter 6 & Chapter 12 from Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Rivest, Clifford Stein ( 2022 ). Introduction to Algorithms. MIT Press.Fourth edition

Chapter 4 from Michael T.Goodrich, Roberto Tamassia (2014).


Algorithm Design and Applications. First Edition Wiley.

268
Thank You
‫ر‬
‫الجامعة السعودية االلكتونية‬
‫ر‬
‫االلكتونية‬ ‫الجامعة السعودية‬

‫‪26/12/2021‬‬
College of Computing and Informatics

Design and Analysis of Algorithms


Design and Analysis of Algorithms

Module 7
Transform-and-Conquer
Week Learning Outcomes

Apply the heapsorting in problem domain

Recognize problem reduction

273
Topic Outline

➢ Heaps and Heapsort


➢ Problem Reduction

274
Heaps and Heapsort
Heaps and Heapsort

Definition
• A heap is a binary tree with keys at its nodes (one key per node) such that:
• It is essentially complete,
• i.e., all its levels are full except possibly the last level, where only some
rightmost keys may be missing

• The key at each node is ≥ all keys in its children (and descendents)
276
Illustration of the heap’s definition

10 10 10

5 7 5 7 5 7

4 2 1 2 1 6 2 1

a heap not a heap not a heap

Note: Heap’s elements are ordered top down (along any path down from its
root), but they are not ordered left to right
277
Some Important Properties of a Heap

• Given n, there exists a unique binary tree (structure) with n nodes that is
essentially complete, with h = log2 n

• The root contains the largest key

• Every subtree rooted at every node of a heap is also a heap

• A heap can easily be represented as an array (and usually is). Look at the
following example, and then explain why the array representation always
works.

278
Heap’s Array Representation

Store heap’s elements in an array (whose elements indexed, for convenience, 1


to n) in top-down left-to-right order

Example: 9
1 2 3 4 5 6
5 3 9 5 3 1 4 2

1 4 2

Left child of node j is at 2j


Right child of node j is at 2j+1
Parent of node j is at j/2
Parental (ie interior) nodes are in the first n/2 locations
279
Heap Construction (bottom-up)

Step 0: Initialize structure (ie array) with keys in order give

Step 1:
Starting with the last (rightmost) parental (ie interior) node, fix the heap rooted at
it, if it doesn’t satisfy the
heap condition: keep exchanging it with its largest child until the heap
condition holds

Step 2:
Repeat Step 1 for the preceding parental node
Bottom up: Adding nodes to heap from bottom to top, pushing elements down,
as needed

Step 1 called Heapify: 280


Example of Heap Construction [Bottom Up]

Construct a heap for the list 2, 9, 7, 6, 5, 8 . Insert elements into the array.

2 2 2

9 7 > 9 8 9 8

6 5 8 6 5 7 6 5 7

2 9 9

9 8 > 2 8 > 6 8

6 5 7 6 5 7 2 5 7

Worst case? Number of comparisons for a node??


281
Pseudocode of bottom-up heap construction

282
Insert a New Element into a Heap

• Basic operation in building heap in Top Down order


• Insert the new element at last position in heap (size++)
• Compare new element with its parent and, if it violates heap condition,
exchange them
• Continue comparing the new element with nodes up the tree until the heap
condition is satisfied

283
Insert a New Element into a Heap

• Example: Insert key 10 into heap of size 6

9 9 10

6 8 > 6 10 > 6 9

2 5 7 10 2 5 7 8 2 5 7 8

What order is this?


Efficiency: O(log n)
284
Example of Heap Construction [Top Down]

Construct a heap for the list 2, 9, 7, 6, 5, 8 [Start with elements in the array.
Process interior nodes from L to R. Add nodes from top to bottom. Add each
node, and push up as needed.
2
29
9/ 2
9/ 2 7
9/ 2 7/ 6
9/ 6 7/ 2; 9/ 6 7/ 2 5; 9/ 6 7/ 2 5, 8; 9/ 6 8/ 2 5, 7
Is this a heap??
285
Heapsort

Stage 1:
Construct a heap for a given list of n keys

Stage 2:
Repeat operation of root removal n-1 times:
Exchange keys in the root and in the last (rightmost) leaf
Decrease heap size by 1
If necessary, repeatedly swap new root (node) with larger child until
the heap condition again holds

286
Example of Sorting by Heapsort

Sort the list 2, 9, 7, 6, 5, 8 by heapsort

Stage 1 (heap construction). Stage 2 (root/max removal)


1 9 7 6 5 8 9 6 8 2 5 7
2 9 8 6 5 7 7 6 8 2 5|9
2 9 8 6 5 7 8 6 7 2 5|9
9 2 8 6 5 7 5 6 7 2|8 9
9 6 8 2 5 7 7 6 5 2|8 9
2 6 5|7 8 9
6 2 5|7 8 9
5 2|6 7 8 9
5 2|6 7 8 9
2|5 6 7 8 9

287
Analysis of Heapsort

Stage 1: Build heap for a given list of n keys (of height h)


worst-case h-1
C(n) =  2(h-i) 2i = 2 ( n – log2(n + 1))  (n)
i=0
# nodes at level i

Stage 2: Repeat operation of root removal n-1 times (fix heap)


worst-case n-1
C(n) =  2log2 i  (nlogn)
i=1
Both worst-case and average-case efficiency: (nlogn)

In-place: yes
Stability: no (e.g., apply heapsort to 1 1)
288
Problem Reduction
Problem Reduction

This variation of transform-and-conquer solves a problem by a transforming


it into different problem for which an algorithm is already available.

Reduce P to Q = use a solution of Q as part of a solution to P


To solve P, first solve Q then use solution to Q in solution to P
OR
To solve P, write an algorithm that calls Q

To be of practical value, the combined time of the transformation and solving


the other problem should be smaller than solving the problem as given by
another method [eg by brute force].

290
Examples of Solving Problems by Reduction

Computing. Least- common- multiplication lcm(m, n) via computing


Greatest- common-divisor gcd(m, n) !!

P = LCM, Q = GCD
Reduce P to Q: Reduce LCM to GCD
LCM(x, y) = (x * y) / GCD(x, y)
LCM(60, 42) = (60*42) / GCD(60, 42) = 2520 / 6 = 420
60=2*2*3*5. 42=2*3*7. GCD=2*3. LCM=2*2*3*5*7.

Assume * and / are constant time (ie Θ(1)).


Compare performance of Best Algorithms for P & Q.

291
Problem Reduction: Notation

Reduce LCM to GCD:

LCM(x, y) = (x * y) / GCD(x, y)

Notation: P  Q means P reduces to Q.


LCM  GCD.

Does this notation seem backwards?

292
Problem Reduction: Best Alg. Performance

P = LCM, Q = GCD. Reduce P to Q: Reduce LCM to GCD


LCM(x, y) = (x * y) / GCD(x, y)
LCM(60, 42) = (60*42) / GCD(60, 42) = 2520 / 6 = 420

Assume reduction steps are faster than Q (eg * and / are Θ(1)).
Assume best alg for GCD is used. Are these stmts true or false?
- The best alg for LCM can be faster than best alg for GCD.
- The best alg for GCD can be faster than best alg for LCM.
- Examples:
Best LCM: O(n) and Best GCD: O(n^2)
Best GCD: O(n) and Best LCM: O(n^2)

Notation: LCM  GCD


293
More Problem Reduction Examples

• Reduce finding median to ??


• Reduce finding minimum to ??
• Reduction to graph problems (e.g., solving puzzles via state-space
graphs)
• Transforming a maximization problem to a minimization problem and vice
versa
• linear programming
• counting number of paths of length n in a graph by raising the graph’s
adjacency matrix to the n-th power

294
Summary

• Problem reduction calls for transforming a given problem to another problem that
can be solved by a known algorithm.
• A heap is an essentially complete binary tree with keys (one per node) satisfying
the parental dominance requirement. Though defined as binary trees, heaps are
normally implemented as arrays. Heaps are most important for the efficient
implementation of priority queues; they also underlie heapsort.
• Heapsort is a theoretically important sorting algorithm based on arranging
elements of an array in a heap and then successively removing the largest element
from a remaining heap.
• AVL trees are binary search trees that are always balanced to the extent possible
for a binary tree. The balance is maintained by transformations of four types called
rotations. All basic operations on AVL trees are in O(log n)

295
List of Readings

➢ Required Reading :
Chapter 6 from Anany Levitin (2012) . Introduction to the
Design and Analysis of Algorithms. Pearson, 2012, 3rd edition

296
Thank You

You might also like