Algorithm Analysis - Complexity
What Is Algorithm Analysis?
How do we compare programs with one another?
When two programs solve the same problem but look
different, is one program better than the other?
What criteria are we using to compare them?
Readability?
Efficiency?
Why do we need algorithm analysis/complexity?
Writing a working program is not good enough
The program may be inefficient!
If the program is run on a large data set, then the running time
becomes an issue
Data Structures & Algorithms
Data Structures:
A systematic way of organizing and accessing data.
No single data structure works well for ALL purposes.
Algorithm Analysis/Complexity
When we analyze the performance of an algorithm, we are
interested in how much of a given resource the algorithm uses
to solve a problem.
The most common resources are time (how many steps it takes
to solve a problem) and space (how much memory it takes).
We are going to be mainly interested in how long our
programs take to run, as time is generally a more precious
resource than space.
Efficiency of Algorithms
For example, the following graphs show the execution time, in
milliseconds, against sample size n of a given problem on different
computers.
[Graph: execution time against sample size n on several computers, one curve labelled "more powerful computer"]
Running-time of Algorithms
In order to compare algorithm speeds experimentally
All other variables must be kept constant, i.e.
independent of specific implementations,
independent of computers used, and
independent of the data on which the program runs.
This involves a lot of work, so it is better to have some
theoretical means of predicting algorithm speed.
Example 1
Task:
Complete the sum_of_n() function which calculates the sum of
the first n natural numbers.
Arguments: an integer
Returns: the sum of the first n natural numbers
Cases:
sum_of_n(5)        # returns 15
sum_of_n(100000)   # returns 5000050000
Algorithm 1
sum_of_n computes the sum with a for loop. The timing calls are
embedded before and after the summation to calculate the time
required for the calculation.

import time

def sum_of_n(n):
    time_start = time.time()
    the_sum = 0
    for i in range(1, n + 1):
        the_sum = the_sum + i
    time_end = time.time()
    time_taken = time_end - time_start
    return the_sum, time_taken
Algorithm 2
sum_of_n_2 computes the same sum directly from the equation
n(n + 1)/2, with the same embedded timing calls.

import time

def sum_of_n_2(n):
    time_start = time.time()   # the original time.clock() was removed in Python 3.8
    the_sum = (n * (n + 1)) // 2
    time_end = time.time()
    time_taken = time_end - time_start
    return the_sum, time_taken
Experimental Result
Using 4 different values for n: [10000, 100000, 1000000, 10000000]

n            sum_of_n (for loop)    sum_of_n_2 (equation)
10000        0.0033                 0.00000181
100000       0.0291                 0.00000131
1000000      0.3045                 0.00000107
10000000     2.7145                 0.00000123

The for loop is a time-consuming process: its running time grows
roughly tenfold each time n grows tenfold. The equation version is
NOT impacted by the number of integers being added.
Advantages of Learning Analysis
Predict the running-time during the design phase
The running time should be independent of the type of
input
The running time should be independent of the
hardware and software environment
Save your time and effort
The algorithm does not need to be coded and debugged
Help you to write more efficient code
Basic Operations
We need to estimate the running time as a function of
problem size n.
A primitive operation takes a unit of time. The actual length of
time will depend on external factors such as the hardware and
software environment.
Each of these kinds of operations would take the same amount of
time on a given hardware and software environment:
Assigning a value to a variable
Calling a method.
Performing an arithmetic operation.
Comparing two numbers.
Indexing a list element.
Returning from a function
Example 2A
Example: Calculating the sum of the first 10 elements in the list.

def count1(numbers):
    the_sum = 0                             # 1 assignment
    index = 0                               # 1 assignment
    while index < 10:                       # 11 comparisons
        the_sum = the_sum + numbers[index]  # 10 plus/assignments
        index += 1                          # 10 plus/assignments
    return the_sum                          # 1 return

Total = 34 operations
Example 2B
Example: Calculating the sum of all elements in the list.

def count2(numbers):
    n = len(numbers)                        # 1 assignment
    the_sum = 0                             # 1 assignment
    index = 0                               # 1 assignment
    while index < n:                        # n + 1 comparisons
        the_sum = the_sum + numbers[index]  # n plus/assignments
        index += 1                          # n plus/assignments
    return the_sum                          # 1 return

Total = 3n + 5 operations

We need to measure an algorithm's time requirement as a
function of the problem size, e.g. in the example above the
problem size is the number of elements in the list.
Problem size
Performance is usually measured by the rate at which the
running time increases as the problem size gets bigger,
i.e. we are interested in the relationship between the running time
and the problem size.
It is very important that we identify what the problem size is.
For example, if we are analyzing an algorithm that processes a list, the
problem size is the size of the list.
In many cases, the problem size will be the value of a
variable, where the running time of the program
depends on how big that value is.
Exercise 1
How many operations are required to do the following tasks?
Example 3
Growth Rate Function – A or B?
[Graph: running times of algorithm A (n²/5) and algorithm B (5n) plotted against n]
For smaller values of n, the differences between algorithm A
(n²/5) and algorithm B (5n) are not very big. But the
differences are very evident for larger problem sizes, such as
n = 1,000,000, where the step counts are
2 × 10¹¹ vs 5 × 10⁶.
Big O - Definition
Big-O Notation
We use Big-O notation (capital letter O) to specify the
order of complexity of an algorithm
e.g., O(n²), O(n³), O(n).
If a problem of size n requires time that is directly
proportional to n, the problem is O(n) – that is, order n.
If the time requirement is directly proportional to n², the
problem is O(n²), etc.
Big-O Notation (Formal Definition)
Given functions f(n) and g(n), we say that f(n) is O(g(n))
if there are positive constants c and n0 such that
f(n) ≤ c·g(n) for every integer n ≥ n0.

Example: 2n + 10 is O(n)
2n + 10 ≤ cn
(c − 2)n ≥ 10
n ≥ 10/(c − 2)
Pick c = 3 and n0 = 10

[Graph: n, 2n + 10 and 3n on log-log axes for n from 1 to 1,000; 3n dominates 2n + 10 from n = 10 onward]
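A quick numeric check of these constants (my own illustration, not from the original slides):

# Verify f(n) = 2n + 10 <= c*g(n) = 3n for every n >= n0 = 10:
print(all(2*n + 10 <= 3*n for n in range(10, 10001)))   # True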
Properties of Big-O
There are three properties of Big-O
Ignore low-order terms in the function (smaller terms)
O(f(n)) + O(g(n)) = O(max(f(n), g(n)))
Ignore any constants in the high-order term of the function
c · O(f(n)) = O(f(n))
Combine growth-rate functions
O(f(n)) * O(g(n)) = O(f(n) * g(n))
O(f(n)) + O(g(n)) = O(f(n) + g(n))
Ignore low order terms
Consider the function:
f(n) = n² + 100n + log10(n) + 1000
For small values of n, the last term, 1000, dominates.
When n is around 10, the terms 100n + 1000 dominate.
When n is around 100, the terms n² and 100n dominate.
When n gets much larger than 100, the n² term dominates all the others.
So it would be safe to say that this function is O(n²) for values of n > 100.

Consider another function:
f(n) = n³ + n² + n + 5000
Big-O is O(n³)

And consider another function:
f(n) = n + n² + 5000
Big-O is O(n²)
Ignore any Constant Multiplications
Consider the function:
f(n) = 254 · n² + n
Big-O is O(n²)

Consider another function:
f(n) = n / 30
Big-O is O(n)

And consider another function:
f(n) = 3n + 1000
Big-O is O(n)
Combine growth-rate functions
Consider the function:
f(n) = n · log n
Big-O is O(n log n)
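For instance (a sketch of my own, not from the slides), a growth rate of n · log n arises when an O(n) loop contains an O(log n) inner loop, per the combination rule O(f(n)) * O(g(n)) = O(f(n) * g(n)):

def combined(n):
    count = 0
    for i in range(n):    # outer loop: O(n)
        j = n
        while j > 1:      # inner loop halves j each pass: O(log n)
            count += 1
            j = j // 2
    return count          # total work is O(n log n)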
Exercise 2
What is the Big-O performance of the following growth functions?
T(n) = n + log(n)?
T(n) = n⁴ + n·log(n) + 300n³?
Best, average & worst-case complexity
In some cases, we may need to consider the best, worst and/or
average performance of an algorithm.
For example, suppose we are required to sort a list of numbers in
ascending order:
Worst-case:
the list is in reverse order
Best-case:
the list is already in order
Average-case:
Determine the average amount of time that an algorithm requires to solve problems of
size n
More difficult to perform the analysis
Difficult to determine the relative probabilities of encountering various problems of a
given size
Difficult to determine the distribution of various data values
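As a concrete illustration (an assumed example, not from the slides), linear search exhibits these cases: the target at the front is the best case, O(1); a missing target is the worst case, O(n):

def linear_search(items, target):
    for i in range(len(items)):
        if items[i] == target:
            return i       # best case: target found at index 0
    return -1              # worst case: every element examined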
Calculating Big-O
Rules for finding out the time complexity of a piece of
code
Straight-line code
Loops
Nested Loops
Consecutive statements
If-then-else statements
Logarithmic complexity
Rules
Rule 1: Straight-line code
Big-O = constant time, O(1)
Does not vary with the size of the input
Examples:
x = a + b    # assigning a value to a variable / performing an arithmetic operation
i = y[2]     # indexing a list element

Rule 2: Loops
The running time of the statements inside the loop (including
tests) times the number of iterations
Example:
for i in range(n):   # executed n times
    print(i)         # constant time
Total time = constant time * n = c * n = O(n)
Rules (con’t)
Rule 3: Nested Loops
Analyze inside out. Total running time is the product of the sizes
of all the loops.
Example:
for i in range(n):       # outer loop: executed n times
    for j in range(n):   # inner loop: executed n times
        k = i + j        # constant time
Total time = c * n * n = c·n² = O(n²)

Rule 4: Consecutive statements
Add the time complexities of each statement
Example:
x = x + 1            # constant time
for i in range(n):   # loop executed n times
    m = m + 2        # constant time
Total time = c0 + c1·n
Big-O = O(f(n) + g(n)) = O(max(f(n), g(n))) = O(n)
Rules (con’t)
Rule 5: if-else statement
Worst-case running time: the test, plus either the if part or
the else part (whichever is the larger).
Example:
Worst-case time = c0 + max(c1, n · (c2 + c3)) = O(n)
Assumption:
The condition can be evaluated in constant time. If it is not, we need to
add the time to evaluate the expression.
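The cost formula above suggests code of roughly the following shape (a hypothetical reconstruction; the slide's original example code is not preserved):

if condition:              # test evaluated in constant time c0
    x = a + b              # if part: constant time c1
else:
    for i in range(n):     # else part: loop runs n times
        total = total + i  # each iteration costs c2 + c3
# Worst case: c0 + max(c1, n * (c2 + c3)) = O(n)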
Rules (con’t)
Rule 6: Logarithmic complexity
An algorithm is O(log n) if it takes a constant time to cut the problem
size by a fraction (usually by ½)
Example: finding a word in a dictionary of n pages
Look at the centre point in the dictionary
Is the word to the left or right of centre?
Repeat the process with the left or right part of the dictionary until the word is found
Example:
size = n
while size > 1:
    # O(1) work here
    size = size // 2
Size: n, n/2, n/4, n/8, n/16, ..., 2, 1
If n = 2^k, the loop executes approximately k steps; in the worst case it
runs log2(n) = k times. Big-O = O(log n)
Note: we don't need to indicate the base. Logarithms to different
bases differ only by a constant factor.
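The dictionary-lookup idea above is binary search. A minimal sketch (my own illustration, not code from the slides):

def binary_search(sorted_items, target):
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2          # look at the centre point
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            low = mid + 1                # repeat with the right half
        else:
            high = mid - 1               # repeat with the left half
    return -1                            # not found
# Each pass halves the search range, so the loop runs O(log n) times.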
Hypothetical Running Time
The running time on a hypothetical computer that computes 10⁶ operations
per second, for various problem sizes:

Notation              n = 10     n = 10²    n = 10³    n = 10⁴     n = 10⁵      n = 10⁶
O(1)   Constant       1 µsec     1 µsec     1 µsec     1 µsec      1 µsec       1 µsec
O(n²)  Quadratic      100 µsec   10 msec    1 sec      1.7 min     2.8 hr       11.6 days
O(n³)  Cubic          1 msec     1 sec      16.7 min   11.6 days   31.7 years   31,709 years
O(2ⁿ)  Exponential    1 msec     ~4×10¹⁶ years; astronomically larger for n ≥ 10³
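One table entry checked by hand (my own arithmetic, using the table's model of 10⁶ operations per second):

ops = 1000 ** 3        # an O(n^3) algorithm at n = 10**3 does 10**9 operations
secs = ops / 10**6     # at 10**6 operations per second: 1000 seconds
print(secs / 60)       # 16.666... minutes (the "16.7 min" cell in the cubic row)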
Comparison of Growth Rate
Constant Growth Rate - O(1)
Time requirement is constant and, therefore, independent of
the problem’s size n.
def rate1(n):
    s = "SWEAR"
    for i in range(25):
        print("I must not ", s)
Logarithmic Growth Rate - O(log n)
Increases slowly as the problem size increases
If you square the problem size, you only double its time
requirement
The base of the log does not affect a log growth rate, so you
can omit it.
def rate2(n):
    s = "YELL"
    i = 1
    while i < n:
        print("I must not ", s)
        i = i * 2
Linear Growth Rate - O(n)
The time increases directly with the sizes of the problem.
If you square the problem size, you also square its time
requirement
def rate3(n):
    s = "FIGHT"
    for i in range(n):
        print("I must not ", s)
n* log n Growth Rate - O(n log(n))
The time requirement increases more rapidly than a linear
algorithm.
Such algorithms usually divide a problem into smaller
problems that are each solved separately.
def rate4(n):
    s = "HIT"
    for i in range(n):
        j = n
        while j > 1:
            print("I must not ", s)
            j = j // 2
Quadratic Growth Rate - O(n²)
The time requirement increases rapidly with the size of the
problem.
Algorithms that use two nested loops are often quadratic.
def rate5(n):
    s = "LIE"
    for i in range(n):
        for j in range(n):
            print("I must not ", s)
Cubic Growth Rate - O(n³)
The time requirement increases more rapidly with the size of the
problem than the time requirement for a quadratic algorithm.
Algorithms that use three nested loops are often cubic and
are practical only for small problems.
def rate6(n):
    s = "SULK"
    for i in range(n):
        for j in range(n):
            for k in range(n):
                print("I must not ", s)
Exponential Growth Rate - O(2ⁿ)
As the size of a problem increases, the time requirement
usually increases too rapidly to be practical.
def rate7(n):
    s = "POKE OUT MY TONGUE"
    for i in range(2 ** n):
        print("I must not ", s)
Exercise 3
What is the Big-O of the following statements?

for i in range(n):        # outer loop: executed n times
    for j in range(10):   # inner loop: executed 10 times
        print(i, j)       # constant time

Big-O = ?

(Recall from Rule 4: with consecutive code blocks, O(f(n)) + O(g(n)) = O(max(f(n), g(n))); for example, a nested pair of loops that is O(n²) followed by a single O(n) loop gives O(max(n², n)) = O(n²).)
Exercise 3
What is the Big-O of the following statements?
for i in range(n):
    for j in range(i + 1, n):
        print(i, j)
Performance of Python Data Structures
We have a general idea of
Big-O notation and
the differences between the different functions,
Now, we will look at the Big-O performance for the
operations on Python lists and dictionaries.
It is important to understand the efficiency of these
Python data structures
In later chapters we will see some possible
implementations of both lists and dictionaries and how the
performance depends on the implementation.
Review
Python lists are ordered sequences of items.
Specific values in the sequence can be referenced using
subscripts.
Python lists are:
dynamic. They can grow and shrink on demand.
heterogeneous. A single list can hold arbitrary data types.
mutable sequences of arbitrary objects.
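A small demo (my own illustration) of these three properties:

mixed = [1, "two", 3.0]   # heterogeneous: different types in one list
mixed.append(True)        # dynamic: the list grows on demand
mixed[0] = 99             # mutable: items can be replaced in place
print(mixed)              # [99, 'two', 3.0, True]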
List Operations
Using operators:

my_list = [1, 2, 3, 4]
print(2 in my_list)    # True

zeroes = [0] * 20
print(zeroes)          # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
List Operations
Using Methods:
Examples

my_list = [3, 1, 4, 1, 5, 9]
my_list.append(2)
print(my_list)            # [3, 1, 4, 1, 5, 9, 2]
my_list.sort()
print(my_list)            # [1, 1, 2, 3, 4, 5, 9]
my_list.reverse()
print(my_list)            # [9, 5, 4, 3, 2, 1, 1]
print(my_list.index(4))   # 2, the index of the first occurrence of the parameter
my_list.insert(4, "Hello")
print(my_list)            # [9, 5, 4, 3, 'Hello', 2, 1, 1]
print(my_list.pop(3))     # 3
print(my_list)            # [9, 5, 4, 'Hello', 2, 1, 1]
List Operations
The del statement
Remove an item from a list given its index instead of its value
Used to remove slices from a list or clear the entire list
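For example (my own illustration, not from the slides):

my_list = [0, 1, 2, 3, 4, 5]
del my_list[0]       # remove by index         -> [1, 2, 3, 4, 5]
del my_list[1:3]     # remove a slice          -> [1, 4, 5]
del my_list[:]       # clear the entire list   -> []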
Big-O Efficiency of List Operators
O(1) - Constant
Operations for indexing and assigning to an index
position
Big-O = O(1)
It takes the same amount of time no matter how large
the list becomes.
i.e. independent of the size of the list
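For instance (an assumed demonstration), indexing near the end of a million-element list is no slower than indexing a ten-element one:

big = list(range(10**6))
x = big[999_999]       # O(1): direct index lookup
big[999_999] = -1      # O(1): assigning to an index position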
Inserting elements to a List
There are two ways to create a longer list.
Use the append method or the concatenation operator
4 Experiments
Four different ways to generate a list of n numbers starting with 0.

Example 1: Using a for loop and creating the list by concatenation
my_list = []
for i in range(n):
    my_list = my_list + [i]

Example 2: Using a for loop and the append method
my_list = []
for i in range(n):
    my_list.append(i)

Example 3: Using a list comprehension
my_list = [i for i in range(n)]

Example 4: Using the range function wrapped by a call to the list constructor
my_list = list(range(n))
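A minimal timing harness for this experiment (a sketch assuming the standard timeit module; the slides do not show the measurement code):

import timeit

def concat(n):
    my_list = []
    for i in range(n):
        my_list = my_list + [i]   # rebuilds the list every iteration

def append(n):
    my_list = []
    for i in range(n):
        my_list.append(i)         # extends the existing list in place

for func in (concat, append):
    t = timeit.timeit(lambda: func(1000), number=100)
    print(func.__name__, t)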
The Result
From the results of our experiment:
Append: Big-O is O(1)
Concatenation: Big-O is O(k), where k is the size of the list being joined
Pop() vs Pop(0)
From the results of our experiment:
As the list gets longer and longer, the time it takes to pop(0) also
increases, while the time for pop() stays very flat.
pop(0): Big-O is O(n)
pop(): Big-O is O(1)
Why?
[Graph: time for pop(0) grows with list length; time for pop() stays flat]
Pop() vs Pop(0)
pop():
Removes the element from the end of the list
pop(0):
Removes the element from the beginning of the list.
Big-O is O(n), as we need to shift all remaining elements one
space toward the beginning of the list:

Before pop(0):  12  3  44  100  5  ...  18
After pop(0):       3  44  100  5  ...  18
Exercise 4
Which of the following list operations is not O(1)?
1. list.pop(0)
2. list.pop()
3. list.append()
4. list[10]
Introduction
Dictionaries store a mapping between a set of keys and a set
of values
Keys can be any immutable type.
Values can be any type
A single dictionary can store values of different types
You can define, modify, view, lookup or delete the key-value
pairs in the dictionary
Dictionaries are unordered
Note:
Dictionaries differ from lists in that you can access items in a
dictionary by a key rather than a position.
Examples:
(The code below is reconstructed to match the outputs preserved on the original slide.)

capitals = {'Wisconsin': 'Madison', 'Iowa': 'DesMoines'}
print(capitals['Iowa'])     # DesMoines
capitals['Utah'] = 'SaltLakeCity'
print(capitals)             # {'Wisconsin':'Madison','Iowa':'DesMoines','Utah':'SaltLakeCity'}
capitals['California'] = 'Sacramento'
print(len(capitals))        # 4
for k in capitals:
    print(capitals[k], "is the capital of", k)
# Sacramento is the capital of California
# Madison is the capital of Wisconsin
# DesMoines is the capital of Iowa
# SaltLakeCity is the capital of Utah
Big-O Efficiency of Operators
Table 2.3
The contains operator: lists vs. dictionaries
From the results:
The time it takes for the contains ("in") operator on a list grows
linearly with the size of the list.
The time for the contains operator on a dictionary is constant,
even as the dictionary size grows.
Lists: Big-O is O(n)
Dictionaries: Big-O is O(1)
[Graph: contains-operator time against container size, one curve for lists and one for dictionaries]
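A sketch of how this comparison could be measured (assumed code, mirroring the append/concat harness above):

import timeit

for size in (10_000, 100_000, 1_000_000):
    lst = list(range(size))
    d = {i: None for i in range(size)}
    t_list = timeit.timeit(lambda: (size - 1) in lst, number=1000)
    t_dict = timeit.timeit(lambda: (size - 1) in d, number=1000)
    print(size, t_list, t_dict)   # list time grows with size; dict time stays flat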
Quiz
Summary
Complexity analysis measures an algorithm’s time requirement as a
function of the problem size by using a growth-rate function.
It is an implementation-independent way of measuring an algorithm
Complexity analysis focuses on large problems
Worst-case analysis considers the maximum amount of work an
algorithm will require on a problem of a given size
Average-case analysis considers the expected amount of work that it
will require.
Generally we want to know the worst-case running time.
It provides the upper bound on time requirements
We may need average or the best case
Normally we assume worst-case analysis, unless told otherwise.