
Algorithm Analysis

Data Structure & Algorithms with Python


Lecture 3
Overview
● Introduction - Motivation, definitions etc.
● Types of Analyses - Best case, worst case and average case
● Various functions used for analysing rate of growth
● Asymptotic analysis:
○ Big-Oh notation
○ Big-Omega notation
○ Big-Theta notation
● Comparison between various kinds of complexities
● Examples of how algorithms can be modified to reduce their computational complexity.
● Summary
Introduction
● What is an algorithm?
An algorithm is a step-by-step procedure with unambiguous instructions to solve a given problem.

● What is a data structure?
A data structure is a systematic way of organizing and accessing data.

● Why is the analysis of algorithms required?
There are multiple algorithms for doing the same thing. Algorithm analysis helps us select the most efficient algorithm in terms of the time and space consumed.

● What is the goal of algorithm analysis?
To compare algorithms (or solutions) mainly in terms of running time, but also in terms of other factors (memory requirement, developer effort, etc.).
Experimental Analysis of Algorithms

Experimentally, an algorithm is analysed using the following two quantities:

● Running time
● Static and run-time memory requirement

Running time is usually found to be proportional to the size of the input being processed.
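As a minimal illustration of experimental timing (the function under test, work, is a hypothetical stand-in; the timing pattern itself uses only the standard time module):

import time

def work(n):
    # hypothetical function under test: sums the first n integers
    total = 0
    for i in range(n):
        total += i
    return total

for n in (10_000, 100_000, 1_000_000):
    start = time.perf_counter()          # wall-clock timestamp before the run
    work(n)
    elapsed = time.perf_counter() - start
    print(f"n = {n:>9}: {elapsed:.6f} s")

Running this typically shows the elapsed time growing roughly in proportion to n, as noted above, though the absolute numbers depend entirely on the machine.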

Challenges of Experimental Analysis

● Experimental running time depends on the hardware and software platform used for the implementation.
● Experiments can be done only on a limited set of test inputs.
● An algorithm needs to be fully implemented in order to execute it and study its running time experimentally.

Input size refers to:
● Size of an array
● Degree of a polynomial
● Number of elements in a matrix
● Number of bits in the binary representation of the input
● Vertices and edges in a graph
Objectives of Algorithm Analysis

Our goal is to develop an approach to analyzing the efficiency of algorithms that:

● Allows us to evaluate the relative efficiency of algorithms in a way that is independent of the hardware and software environment.
● Is performed by studying a high-level description of the algorithm, without the need for an implementation.
● Takes into account all possible inputs.

The approach involves two steps:

● Counting primitive operations (low-level instructions with fixed execution time), e.g.:
○ Assigning an identifier to an object
○ Determining the object associated with an identifier
○ Arithmetic operations
○ Comparing two numbers
○ Accessing an element of an array or a list
○ Calling a function
○ Returning from a function
● Measuring the operation count as a function of input size.

Rate of growth: the rate at which the running time increases as a function of input size.
Types of Analysis

● Worst Case
○ Defines the input for which the algorithm takes the maximum time to complete.
● Best Case
○ Defines the input for which the algorithm takes the minimum time to complete.
● Average Case
○ Provides an average prediction of the running time of the algorithm.
○ Requires an understanding of the probability distribution of the input data.
Seven Functions for Analysing Rate of Growth

Approximation: when one term dominates, the smaller terms can be ignored. For example:

Total Cost = cost_of_car + cost_of_bicycle
Total Cost ≈ cost_of_car
The Constant Function: f(n) = c
Computation time does not change with input size.

Examples:
● Adding a number at the end of an array
● Adding two numbers
● Assigning a value to a variable
● Comparing two numbers
The Logarithmic Function: f(n) = log_b n
Computation time increases logarithmically with input size.

In computer science, we consider the base to be 2; by default, we assume log n = log_2 n.

Logarithmic rules (for a, c > 0 and base b > 1):
● log_b(ac) = log_b a + log_b c
● log_b(a/c) = log_b a − log_b c
● log_b(a^c) = c · log_b a
● log_b a = (log_d a) / (log_d b) for any base d > 1

Example:
● Finding an element in a sorted array (why? see the binary-search sketch below)
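A minimal binary-search sketch showing why searching a sorted array is logarithmic (the function name and interface are illustrative): each comparison halves the range that can still contain the target.

def binary_search(data, target):
    """Return True if target is in the sorted list data."""
    low, high = 0, len(data) - 1
    while low <= high:
        mid = (low + high) // 2
        if data[mid] == target:
            return True
        elif data[mid] < target:
            low = mid + 1        # discard the lower half
        else:
            high = mid - 1       # discard the upper half
    return False

Since the candidate range shrinks by half on every iteration, at most about log_2 n iterations are needed.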
The Linear Function: f(n) = n
The computation time increases linearly with input size.

Example:
● Finding an element in an unsorted array (why? every element may have to be inspected once)

The N-Log-N Function: f(n) = n log n
The function grows a little more rapidly than the linear function and a lot less rapidly than the quadratic function.

Example:
● Sorting n items by a divide-and-conquer (mergesort) algorithm; O(n log n) is the best possible asymptotic bound for comparison-based sorting (see the sketch below).
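A compact mergesort sketch (a standard formulation, not necessarily the one used in this course): the list is halved O(log n) times, and each level of recursion does O(n) merging work, giving O(n log n) overall.

def merge_sort(data):
    """Return a sorted copy of data using divide-and-conquer."""
    if len(data) <= 1:
        return data
    mid = len(data) // 2
    left = merge_sort(data[:mid])      # recursively sort each half
    right = merge_sort(data[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):   # merge the halves in linear time
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged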
Nested Loops and the Quadratic Function: f(n) = n^2

Examples:
● Shortest path between two nodes in a graph.
● Worst-case running time of simple sorting algorithms.
● Algorithms using two nested loops over the input will usually lead to a quadratic rate of growth.
Cubic Function & Other Polynomials

Cubic function: f(n) = n^3

Example:
● Multiplying two n × n matrices by conventional means.

Polynomials: f(n) = a_0 + a_1 n + a_2 n^2 + ... + a_d n^d,
where d is the degree of the polynomial and a_0, a_1, ..., a_d are the coefficients of the polynomial.

Summations: using the notation
    Σ_{i=a}^{b} f(i) = f(a) + f(a+1) + ... + f(b),
polynomial functions can be written as f(n) = Σ_{i=0}^{d} a_i n^i.
The Exponential Function: f(n) = b^n,
where b is the base and n is the exponent.

Exponent rules:
● (b^a)^c = b^(ac)
● b^a · b^c = b^(a+c)
● b^a / b^c = b^(a−c)

Geometric sums: for a ≠ 1,
    Σ_{i=0}^{n} a^i = 1 + a + a^2 + ... + a^n = (a^(n+1) − 1) / (a − 1)

Q. What is the largest number that can be represented in binary notation using n bits?
A. 2^n − 1 = 1 + 2 + 4 + ... + 2^(n−1), the geometric sum with a = 2.
Comparing Growth Rates

Ceiling and floor functions: ⌈x⌉ is the smallest integer greater than or equal to x; ⌊x⌋ is the largest integer less than or equal to x.

In increasing order of complexity (increasing rate of growth):
    1 < log n < n < n log n < n^2 < n^3 < 2^n
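A small illustrative sketch (not part of the slides) that tabulates the seven functions for a few input sizes, making the ordering above concrete:

import math

functions = [
    ("1", lambda n: 1),
    ("log n", lambda n: math.log2(n)),
    ("n", lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2", lambda n: n ** 2),
    ("n^3", lambda n: n ** 3),
    ("2^n", lambda n: 2 ** n),
]

for n in (8, 16, 32):
    # print one row per input size; 2^n quickly dwarfs everything else
    row = ", ".join(f"{name}={f(n):,.0f}" for name, f in functions)
    print(f"n={n}: {row}")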


Comparative Analysis of Various Growth Rates

[Table omitted: maximum size of a problem that can be solved in 1 second, 1 minute and 1 hour, for various running times measured in microseconds; a companion column shows the effect of running the same algorithms on a 256-times-faster machine.]

● It shows the importance of good algorithm design.
● The handicap of an asymptotically slower algorithm cannot be overcome by a dramatic speedup in hardware.
● An asymptotically slow algorithm is beaten in the long run by an asymptotically faster algorithm, even if the constant factor for the faster algorithm is worse. Note that, in the table, the linearly growing runtime has a worse constant factor than the quadratic and exponential runtime algorithms.
Asymptotic Analysis

● It is an approximate analysis of an algorithm's complexity as n tends to infinity.
● We focus on the growth rate of the running time as a function of the input size n.

The "Big-Oh" Notation:

● f(n) is O(g(n)) if there exist constants c > 0 and n_0 >= 1 such that f(n) <= c · g(n) for all n >= n_0.
● The big-Oh notation allows us to say that a function f(n) is "less than or equal to" another function g(n) up to a constant factor and in the asymptotic sense as n grows toward infinity.
● Big-Oh notation provides an asymptotic upper bound on a given function.

def find_max(data):
    '''Return the max value element from a nonempty array.'''
    biggest = data[0]          # c1
    for val in data:
        if val > biggest:      # c2
            biggest = val      # c1
    return biggest
Some Properties of Big-Oh Notation

● If f(n) = a_0 + a_1 n + ... + a_d n^d, n >= 1, is a polynomial of degree d, then f(n) is O(n^d).

Justification: for n >= 1, we have 1 <= n <= n^2 <= ... <= n^d, so
    f(n) <= (|a_0| + |a_1| + ... + |a_d|) · n^d,
i.e., f(n) <= c · n^d with the constant c = |a_0| + |a_1| + ... + |a_d|.
Asymptotic Analysis with Big-Omega Notation

● f(n) is Ω(g(n)) if there exist constants c > 0 and n_0 >= 1 such that f(n) >= c · g(n) for all n >= n_0.
● Big-Omega notation provides an asymptotic lower bound on the function.
Asymptotic Analysis with Big-Theta Notation

● We say f(n) is Θ(g(n)) if f(n) is O(g(n)) and f(n) is Ω(g(n)); equivalently, there exist constants c' > 0, c'' > 0 and n_0 >= 1 such that c' · g(n) <= f(n) <= c'' · g(n) for all n >= n_0.
● In this case, the upper and lower bounds of the given function are the same.
● The rate of growth in the best case and the worst case will be the same, and so will the average-case growth rate.
Some Words of Caution

● Constant factors hidden by Big-Oh notation should not be too large: 10^100 · n is O(n), yet such an algorithm would be useless in practice.
● Similarly, be careful of constants in exponents: for instance, 2^(100n) = (2^100)^n is not O(2^n).
● When using big-Oh notation, we should at least be somewhat mindful of the constant factors and lower-order terms we are "hiding."
Commonly Used Summations

Arithmetic series:  Σ_{i=1}^{n} i = 1 + 2 + ... + n = n(n+1)/2

Geometric series:   Σ_{i=0}^{n} a^i = (a^(n+1) − 1) / (a − 1) for a ≠ 1

Harmonic series:    H_n = Σ_{i=1}^{n} 1/i ≈ ln n

Others:             Σ_{i=1}^{n} i^2 = n(n+1)(2n+1)/6
Few Examples of Asymptotic Analysis of Algorithms

Constant-time operations:
● Finding the length of a list: len(data) ~ O(1)
● Accessing an element in a list: data[j] ~ O(1)

Finding the maximum of a sequence: find_max(data) ~ O(n)

How many times might we update the value of biggest? On a randomly ordered input, the probability that the k-th element is the largest seen so far (triggering an update) is 1/k:
● k = 1: p = 1, certainty that the update will occur the first time
● k = 2: p = 1/2
● k = 3: p = 1/3, and so on.
The expected number of times we update biggest is therefore the harmonic number H_n = 1 + 1/2 + ... + 1/n ~ O(log n).

def find_max(data):
    biggest = data[0]
    for i in range(1, len(data)):
        if data[i] > biggest:
            biggest = data[i]
    return biggest
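As a sanity check, a small simulation (illustrative, not part of the slides) can estimate the expected number of updates on randomly ordered input and compare it with the harmonic number:

import random
from math import log

def average_updates(n, trials=1000):
    """Estimate the expected number of times 'biggest' is updated in
    find_max on a randomly shuffled list of n distinct values."""
    total = 0
    for _ in range(trials):
        data = list(range(n))
        random.shuffle(data)
        biggest = data[0]
        updates = 1                  # k = 1: the first update always occurs
        for val in data[1:]:
            if val > biggest:
                biggest = val
                updates += 1
        total += updates
    return total / trials

# H_n ≈ ln n + 0.577, so for n = 1000 we expect roughly 7.5 updates
print(average_updates(1000), log(1000) + 0.577)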
Running-time Analysis for Standard Components

Type: Loops — Time: O(n)
for i in range(0, n):
    print(i)

Type: Nested loops — Time: O(n^2)
for i in range(0, n):
    for j in range(0, n):
        print(i, j)

Type: Consecutive statements — Time: O(1) + O(n) + O(n^2) = O(n^2)
n = 100
for i in range(0, n):
    print(i)
for i in range(0, n):
    for j in range(0, n):
        print(i, j)

Type: If-then-else — Time: the test plus the larger of the two branches, here O(n)
if n == 1:
    print(n)
else:
    for i in range(0, n):
        print(i)
Algorithms with Logarithmic Growth Rate

● It takes constant time to cut the problem size by a fraction (usually ½).
● Assume that at step k the remaining problem size is n/2^k; the algorithm finishes when n/2^k <= 1, i.e., after k = log_2 n steps.
● Total time is O(log n).
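A tiny illustrative sketch that counts the halving steps directly:

def halving_steps(n):
    """Count how many halvings reduce n to 1; the answer is about log2(n)."""
    steps = 0
    while n > 1:
        n //= 2          # constant-time work cuts the problem size in half
        steps += 1
    return steps

print(halving_steps(1024))   # 10, since 2**10 == 1024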


Average Prefixes:

For a given sequence S of n elements, find another sequence A where A[j] is the average of the elements S[0] to S[j].

Analyzing the computational complexity of a nested-loop implementation (see the sketch below):

● n = len(S) ~ O(1)
● A = [0] * n ~ O(n)
● Outer loop (counter j) ~ O(n)
● Inner loop (counter i) is executed 1 + 2 + 3 + ... + n times = n(n+1)/2 ~ O(n^2)
● Total time = O(1) + O(n) + O(n^2) ~ O(n^2)
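A sketch consistent with the operation counts above (the name prefix_average2 is illustrative; only prefix_average3 is named in the slides):

def prefix_average2(S):
    """Quadratic version: recompute each prefix sum from scratch."""
    n = len(S)                       # O(1)
    A = [0] * n                      # O(n)
    for j in range(n):               # outer loop runs n times
        total = 0
        for i in range(j + 1):       # inner loop runs 1 + 2 + ... + n times
            total += S[i]
        A[j] = total / (j + 1)
    return A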
A linear-time alternative keeps a running total (see the sketch below):

● Initializing the variables n and total uses O(1) time
● Initializing the list A uses O(n) time
● Single for loop: the counter j is updated in O(n) time
● The body of the loop is executed n times ~ O(n)
● Total running time of prefix_average3() is O(n)
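A sketch of prefix_average3 matching that accounting (assuming the standard running-total formulation):

def prefix_average3(S):
    """Linear version: maintain a running sum of the prefix."""
    n = len(S)                       # O(1)
    A = [0] * n                      # O(n)
    total = 0                        # O(1)
    for j in range(n):               # loop body executes n times
        total += S[j]                # extend the running prefix sum
        A[j] = total / (j + 1)
    return A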
Three-way Set Disjointness

● There are three sequences of numbers: A, B and C.
● No individual sequence contains duplicate values.
● There may be some numbers that are in two or three of the sequences.
● The three-way set disjointness problem is to determine whether the intersection of the three sequences is empty, namely, whether there is no element x such that x is in A, x is in B and x is in C.

Worst-case running time of the implementation sketched below ~ O(n^2):

● If there are no matching elements in A and B, there is no need to iterate over C.
● The test condition a == b is evaluated O(n^2) times.
● There can be at most n matching pairs in A and B, and hence the loop over C uses at most O(n^2) time.
● Total time ~ O(n^2)
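A sketch consistent with this analysis (the improved variant that defers the scan of C until a matching pair is found; the function name is illustrative):

def disjoint(A, B, C):
    """Return True if there is no element common to all three lists."""
    for a in A:
        for b in B:
            if a == b:               # this test runs O(n^2) times
                for c in C:          # reached for at most n pairs (a, b)
                    if a == c:
                        return False # found an element common to all three
    return True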


Element Uniqueness

Given a sequence of n numbers, return True if all the elements are distinct.

Example 1: compare every pair of elements. The worst-case running time of this function is proportional to (n−1) + (n−2) + ... + 1 = n(n−1)/2, i.e., O(n^2).

Example 2: sort first, then scan for adjacent duplicates (see the sketches below).
● The sorted() function runs in O(n log n) time.
● The loop runs in O(n) time.
● Total time ~ O(n log n)
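Minimal sketches of the two approaches just described (the function names unique1 and unique2 are illustrative):

def unique1(S):
    """Quadratic version: compare every pair of elements."""
    for j in range(len(S)):
        for k in range(j + 1, len(S)):
            if S[j] == S[k]:
                return False
    return True

def unique2(S):
    """O(n log n) version: sort, then duplicates must be adjacent."""
    temp = sorted(S)                 # O(n log n)
    for j in range(1, len(temp)):    # single O(n) scan
        if temp[j - 1] == temp[j]:
            return False
    return True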
Simple Justification Techniques

● By Example

Consider a generic claim of the form "Every element x in a set S has property P". To disprove such a claim, we only need to produce one particular x from S that does not have property P. Such an instance is called a counterexample.

● The "Contra" Attack

Contrapositive: To justify "if p is true, then q is true", we establish "if q is not true, then p is not true".

Example:
Hypothesis: Let a and b be integers. If ab is even, then a is even or b is even.
To justify the claim, consider the contrapositive: "If a is odd and b is odd, then ab is odd". So a = 2j + 1 and b = 2k + 1 for some integers j and k. Then ab = 4jk + 2j + 2k + 1 = 2(2jk + j + k) + 1, which is odd. Hence, the statement is true.
We apply De Morgan's law here: the negation of "a is even or b is even" is "a is odd and b is odd".

Contradiction: We establish that a statement q is true by first supposing that q is false and then showing that this assumption leads to a contradiction. By reaching such a contradiction, we show that no consistent situation exists with q being false, so q must be true.

Example:
Hypothesis: Let a and b be integers. If ab is odd, then a is odd and b is odd.
Let ab be odd. We wish to show that a is odd and b is odd. Let us assume the opposite (by De Morgan's law): a is even or b is even. If a = 2j, then ab = 2(jb), that is, ab is even. This is a contradiction, as we assumed ab to be odd. Hence, a is odd. Similarly, b is odd, and the above statement is true.
● Induction

Consider a statement q(n) that makes a claim about an infinite set of numbers: "q(n) is true for all n >= 1".

First we show that:
● q(n) is true for n = 1 (and possibly for n = 2, 3, ..., k, for some constant k).
Then we justify the inductive step: q(n) is true whenever q(j) is true for all j < n.
Then, q(n) is true for all n.

Example: Fibonacci function F(n)

We claim that F(n) < 2^n.

Base cases (n <= 2): F(1) = 1 < 2, F(2) = 2 < 4 = 2^2.

Induction step: suppose the claim is true for all j with 2 < j < n. Now show that the hypothesis holds for n.

Proof: F(n) = F(n−1) + F(n−2) < 2^(n−1) + 2^(n−2) < 2^(n−1) + 2^(n−1) = 2^n.

Hence, F(n) < 2^n is true for all n > 2.
Another Example of Induction

Claim: 1 + 2 + ... + n = n(n+1)/2.

Base case: for n = 1, sum = 1 = 1(1+1)/2.

Induction step: for n >= 2, assume that the claim is true for all k < n; in particular, for k = n − 1. Then
    1 + 2 + ... + n = n + (1 + 2 + ... + (n−1)) = n + (n−1)n/2 = n(n+1)/2.

Hence, the above statement is true for all n.
Summary
● Absolute running time is a good metric for analyzing algorithm performance, but it is hardware-dependent and requires a full implementation.
● The growth rate of the running time with input size is used for analyzing algorithms.
● Asymptotic analysis is carried out without implementation, by making asymptotic approximations as n → ∞.
● Worst-case, best-case and average-case analysis.
● Different types of growth rates: constant, logarithmic, linear, n-log-n, quadratic, polynomial and exponential.
● Various notations for asymptotic analysis: Big-Oh, Big-Omega and Big-Theta.
● Various justification techniques for proving or disproving a claim.
