UNDERSTANDING PROGRAM EFFICIENCY: PART 1
(download slides and .py files and follow along!)
6.0001 LECTURE 10
Today
Measuring orders of growth of algorithms
Big “Oh” notation
Complexity classes
WANT TO UNDERSTAND EFFICIENCY OF PROGRAMS
computers are fast and getting faster – so maybe efficient programs don't matter?
◦ but data sets can be very large (e.g., in 2014, Google served 30,000,000,000,000 pages, covering 100,000,000 GB – how long to search brute force?)
◦ thus, simple solutions may simply not scale with size in acceptable manner
how can we decide which option for a program is most efficient?
HOW TO EVALUATE EFFICIENCY OF PROGRAMS
measure with a timer
count the operations
abstract notion of order of growth
TIMING A PROGRAM
use time module
recall that importing means to bring in that class into your own file

import time

def c_to_f(c):
    return c*9/5 + 32
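A minimal sketch of how the timing measurement might look (the perf_counter calls and the trial input are illustrative choices added here, not from the slides):

import time

def c_to_f(c):
    return c*9/5 + 32

t0 = time.perf_counter()       # start the clock
c_to_f(100000)                 # run the code being measured
dt = time.perf_counter() - t0  # elapsed wall-clock seconds
print("t =", dt, "s")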
TIMING PROGRAMS IS INCONSISTENT
GOAL: to evaluate different algorithms
running time varies between algorithms
running time varies between implementations
running time varies between computers
running time is not predictable based on small inputs
COUNTING OPERATIONS
assume these steps take constant time:
• mathematical operations
• comparisons
• assignments
• accessing objects in memory
then count the number of operations executed as function of size of input

def c_to_f(c):
    return c*9.0/5 + 32

def mysum(x):
    total = 0
    for i in range(x+1):
        total += i
    return total

mysum: 1 + 3x ops
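To make the count concrete, here is a hypothetical instrumented copy of mysum that tallies the constant-time steps as it runs (the ops counter is added for illustration; the exact constant depends on which steps you choose to count, which is precisely the weakness noted on the next slide):

def mysum_counted(x):
    ops = 1                   # total = 0 is one assignment
    total = 0
    for i in range(x + 1):
        ops += 3              # set i, add, assign total
        total += i
    return total, ops

print(mysum_counted(100))     # (5050, 304): the count grows linearly in x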
COUNTING OPERATIONS IS BETTER, BUT STILL…
GOAL: to evaluate different algorithms
count depends on algorithm
count depends on implementations
count independent of computers
no clear definition of which operations to count
STILL NEED A BETTER WAY
Going to focus on idea of counting operations in an algorithm, but not worry about small variations in implementation (e.g., whether we take 3 or 4 primitive operations to execute the steps of a loop)
Going to focus on how algorithm performs when size of problem gets arbitrarily large
Want to relate time needed to complete a computation, measured this way, against the size of the input to the problem
Need to decide what to measure, given that actual number of steps may depend on specifics of trial
NEED TO CHOOSE WHICH INPUT TO USE TO EVALUATE A FUNCTION
want to express efficiency in terms of size of input, so need to decide what your input is
◦ could be an integer – mysum(x)
◦ could be length of list – list_sum(L)
◦ you decide when multiple parameters to a function – search_for_elmt(L, e)
DIFFERENT INPUTS CHANGE HOW THE PROGRAM RUNS
a function that searches for an element in a list

def search_for_elmt(L, e):
    for i in L:
        if i == e:
            return True
    return False

◦ when e is first element in the list: BEST CASE
◦ when e is not in list: WORST CASE
◦ when we look through about half of the elements in list: AVERAGE CASE
want to measure this behavior in a general way
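For instance, using the definition above with a hypothetical thousand-element list, the three cases correspond to different calls (illustrative values, not from the slides):

L = list(range(1000))

search_for_elmt(L, 0)     # best case: e is the first element checked
search_for_elmt(L, 500)   # average-ish case: examines about half the list
search_for_elmt(L, -1)    # worst case: e absent, every element examined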
BEST, AVERAGE, WORST CASES
suppose you are given a list L of some length len(L)
best case: minimum running time over all possible inputs of a given size, len(L)
• constant for search_for_elmt
• first element in any list
average case: average running time over all possible inputs of a given size, len(L)
• practical measure
worst case: maximum running time over all possible inputs of a given size, len(L)
• linear in length of list for search_for_elmt
• must search entire list and not find it
ORDERS OF GROWTH
Goals:
◦ want to evaluate program's efficiency when input is very big
◦ want to express the growth of program's run time as input size grows
◦ want to put an upper bound on growth – as tight as possible
◦ do not need to be precise: "order of" not "exact" growth
we will look at largest factors in run time (which section of the program will take the longest to run?)
thus, generally we want tight upper bound on growth, as function of size of input, in worst case
MEASURING ORDER OF GROWTH: BIG OH NOTATION
Big Oh notation measures an upper bound on the asymptotic growth, often called order of growth
EXACT STEPS vs O()

def fact_iter(n):
    """assumes n an int >= 0"""
    answer = 1
    while n > 1:
        answer *= n
        n -= 1
    return answer

computes factorial
number of steps: roughly 1 + 5n + 1 (one assignment before the loop; about five steps per iteration – test, multiply, assign, decrement, assign; plus the final test and return)
worst case asymptotic complexity: O(n)
• ignore additive constants
• ignore multiplicative constants
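One way to sanity-check the step count is an instrumented copy of the function (the steps counter is an illustrative addition; the exact constant depends on counting conventions, but the growth is linear either way):

def fact_iter_counted(n):
    steps = 1                  # answer = 1
    answer = 1
    while n > 1:
        steps += 5             # loop test, multiply, assign, decrement, assign
        answer *= n
        n -= 1
    steps += 1                 # final loop test that fails
    return answer, steps

print(fact_iter_counted(10))   # (3628800, 47): about 5n steps, i.e., O(n)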
WHAT DOES O(N) MEASURE?
Interested in describing how amount of time needed grows as size of (input to) problem grows
Thus, given an expression for the number of operations needed to compute an algorithm, want to know asymptotic behavior as size of problem gets large
Hence, will focus on term that grows most rapidly in a sum of terms
And will ignore multiplicative constants, since want to know how rapidly time required increases as increase size of input
SIMPLIFICATION EXAMPLES
drop constants and multiplicative factors
focus on dominant terms
◦ n^2 + 2n + 2 : O(n^2)
◦ n^2 + 100000n + 3^1000 : O(n^2)
◦ log(n) + n + 4 : O(n)
◦ 0.0001*n*log(n) + 300n : O(n log n)
◦ 2n^30 + 3^n : O(3^n)
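A quick numeric illustration of why the dominant term decides the class: the ratio of the full expression to its leading term tends to 1 as n grows (illustrative script, using the first example above):

for n in [10, 100, 1000, 10**6]:
    full = n**2 + 2*n + 2
    print(n, full / n**2)   # 1.22, 1.0202, 1.002, ~1.000002: lower-order terms fade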
TYPES OF ORDERS OF GROWTH
[figure: plots comparing how the common orders of growth scale with input size]
ANALYZING PROGRAMS AND THEIR COMPLEXITY
combine complexity classes
• analyze statements inside functions
• apply some rules, focus on dominant term
Law of Addition for O():
• used with sequential statements
• O(f(n)) + O(g(n)) is O( f(n) + g(n) )
• for example,

for i in range(n):
    print('a')
for j in range(n*n):
    print('b')

is O(n) + O(n*n) = O(n + n*n) = O(n*n), i.e. O(n^2), because of the dominant n*n term
ANALYZING PROGRAMS AND THEIR COMPLEXITY
combine complexity classes
• analyze statements inside functions
• apply some rules, focus on dominant term
Law of Multiplication for O():
• used with nested statements/loops
• O(f(n)) * O(g(n)) is O( f(n) * g(n) )
• for example,

for i in range(n):
    for j in range(n):
        print('a')

is O(n)*O(n) = O(n*n) = O(n^2) because the outer loop goes n times and the inner loop goes n times for every outer loop iteration
COMPLEXITY CLASSES
O(1) denotes constant running time
O(log n) denotes logarithmic running time
O(n) denotes linear running time
O(n log n) denotes log-linear running time
O(n^c) denotes polynomial running time (c is a constant)
O(c^n) denotes exponential running time (c is a constant being raised to a power based on size of input)
COMPLEXITY CLASSES ORDERED LOW TO HIGH
O(1) : constant
O(log n) : logarithmic
O(n) : linear
O(n log n) : log-linear
O(n^c) : polynomial
O(c^n) : exponential
COMPLEXITY GROWTH

CLASS       n=10    n=100   n=1000   n=1000000
O(1)        1       1       1        1
O(log n)    1       2       3        6
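The faster-growing classes can be tabulated the same way; a short illustrative script (log base 10 here, matching the O(log n) row above) computes their values:

import math

for n in [10, 100, 1000, 1000000]:
    # columns: n (the O(n) row), n*log10(n), n**2
    print(n, round(n * math.log10(n)), n**2)
# 10 10 100 / 100 200 10000 / 1000 3000 1000000 / 1000000 6000000 1000000000000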
LINEAR COMPLEXITY
Simple iterative loop algorithms are typically linear in complexity
LINEAR SEARCH ON UNSORTED LIST

def linear_search(L, e):
    found = False
    for i in range(len(L)):
        if e == L[i]:
            found = True
    return found

must look through all elements to decide e is not there, so the loop runs len(L) times
why is access to a list element constant time?
◦ if list is all ints, L[i] lives at a fixed offset: base + 4*i
◦ if list is heterogeneous, it stores references to other objects, so access goes through one level of indirection – still constant time
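Note that as written, linear_search never stops early; a counted variant (illustrative) shows it takes len(L) iterations even when e is found immediately:

def linear_search_counted(L, e):
    found, tests = False, 0
    for i in range(len(L)):
        tests += 1
        if e == L[i]:
            found = True       # no break: keeps scanning anyway
    return found, tests

print(linear_search_counted(list(range(1000)), 0))   # (True, 1000) even in the "best" case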
LINEAR SEARCH ON SORTED LIST

def search(L, e):
    for i in range(len(L)):
        if L[i] == e:
            return True
        if L[i] > e:
            return False
    return False

must only look until reach a number greater than e
O(len(L)) for the loop * O(1) to test if e == L[i]
overall complexity is O(n) – where n is len(L)
NOTE: order of growth is same, though run time may differ for two search methods
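A counted variant (illustrative) makes that NOTE concrete: run time differs between inputs, but the worst case still scans the whole list:

def search_counted(L, e):
    tests = 0
    for i in range(len(L)):
        tests += 1
        if L[i] == e:
            return True, tests
        if L[i] > e:
            return False, tests   # sorted list lets us stop early
    return False, tests

print(search_counted(list(range(1000)), -1))    # (False, 1): stops at the first element
print(search_counted(list(range(1000)), 2000))  # (False, 1000): worst case scans everything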
LINEAR COMPLEXITY
searching a list in sequence to see if an element is present
add characters of a string, assumed to be composed of decimal digits

def addDigits(s):
    val = 0
    for c in s:
        val += int(c)
    return val

O(len(s))
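Usage check (illustrative values): the loop body runs once per character, so doubling the string doubles the work:

print(addDigits("1234"))       # 10: one constant-time step per character
print(addDigits("12341234"))   # 20: twice the characters, twice the iterations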
LINEAR COMPLEXITY
complexity often depends on number of iterations

def fact_iter(n):
    prod = 1
    for i in range(1, n+1):
        prod *= i
    return prod

number of times around loop is n
number of operations inside loop is a constant (in this case, 3 – set i, multiply, set prod)
◦ O(1 + 3n + 1) = O(3n + 2) = O(n)
overall just O(n)
NESTED LOOPS
simple loops are linear in complexity
what about loops that have loops within them?
QUADRATIC COMPLEXITY
determine if one list is subset of second, i.e., every element of first appears in second (assume no duplicates)

def isSubset(L1, L2):
    for e1 in L1:
        matched = False
        for e2 in L2:
            if e1 == e2:
                matched = True
                break
        if not matched:
            return False
    return True
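Usage check (illustrative values):

print(isSubset([1, 2], [2, 1, 3]))   # True: every element of L1 occurs in L2
print(isSubset([1, 4], [2, 1, 3]))   # False: 4 never matches, so we return early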
QUADRATIC COMPLEXITY

def isSubset(L1, L2):
    for e1 in L1:
        matched = False
        for e2 in L2:
            if e1 == e2:
                matched = True
                break
        if not matched:
            return False
    return True

◦ outer loop executed len(L1) times
◦ each iteration will execute inner loop up to len(L2) times, with constant number of operations
◦ O(len(L1)*len(L2))
◦ worst case when L1 and L2 same length, none of elements of L1 in L2
◦ O(len(L1)^2)
QUADRATIC COMPLEXITY
find intersection of two lists, return a list with each element appearing only once

def intersect(L1, L2):
    tmp = []
    for e1 in L1:
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    res = []
    for e in tmp:
        if not(e in res):
            res.append(e)
    return res
QUADRATIC COMPLEXITY

def intersect(L1, L2):
    tmp = []
    for e1 in L1:
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    res = []
    for e in tmp:
        if not(e in res):
            res.append(e)
    return res

◦ first nested loop takes len(L1)*len(L2) steps
◦ second loop takes at most len(L1) steps
◦ determining if element in list might take len(L1) steps
◦ if we assume lists are of roughly same length, then O(len(L1)^2)
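Usage check (illustrative values), confirming duplicates are removed by the second loop:

print(intersect([1, 2, 3, 4], [3, 4, 5]))   # [3, 4]
print(intersect([1, 1, 2], [1, 2, 2]))      # [1, 2]: tmp holds duplicates, res does not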
O() FOR NESTED LOOPS

def g(n):
    """ assume n >= 0 """
    x = 0
    for i in range(n):
        for j in range(n):
            x += 1
    return x

O(n^2)
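Usage check (illustrative): the returned value equals the number of inner-body executions, making the quadratic growth visible:

print(g(5))    # 25 == 5*5 inner-loop executions
print(g(10))   # 100: doubling n quadruples the work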
THIS TIME AND NEXT TIME
have seen examples of loops, and nested loops, that give rise to linear and quadratic complexity algorithms
next time, will more carefully examine examples from each of the different complexity classes
MIT OpenCourseWare
https://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.