
UNDERSTANDING PROGRAM EFFICIENCY: 1
(download slides and .py files and follow along!)

6.0001 LECTURE 10
Today
• measuring orders of growth of algorithms
• big "Oh" notation
• complexity classes
WANT TO UNDERSTAND EFFICIENCY OF PROGRAMS
• computers are fast and getting faster – so maybe efficient programs don't matter?
  ◦ but data sets can be very large (e.g., in 2014, Google served 30,000,000,000,000 pages, covering 100,000,000 GB – how long to search brute force?)
  ◦ thus, simple solutions may simply not scale with size in an acceptable manner
• how can we decide which option for a program is most efficient?
• separate time and space efficiency of a program
  ◦ can sometimes trade off between them: pre-computed results are stored, then a "lookup" is used to retrieve them (e.g., memoization for Fibonacci)
  ◦ will focus on time efficiency
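The memoization idea mentioned above can be sketched like this: trade space (a cache of stored results) for time, so each Fibonacci subproblem is computed only once.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """nth Fibonacci number; cached results trade space for time."""
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(40))  # 102334155, fast because each subproblem is solved once
```

Without the cache, the same call makes over a billion recursive calls; with it, only 41 distinct subproblems are ever computed.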
WANT TO UNDERSTAND EFFICIENCY OF PROGRAMS
Challenges in understanding efficiency of a solution to a computational problem:
• a program can be implemented in many different ways
• you can solve a problem using only a handful of different algorithms
• would like to separate choices of implementation from choices of the more abstract algorithm
HOW TO EVALUATE EFFICIENCY OF PROGRAMS
• measure with a timer
• count the operations
• abstract notion of order of growth
TIMING A PROGRAM
• use the time module (recall that importing means bringing that module into your own file)

import time

def c_to_f(c):
    return c*9/5 + 32

t0 = time.perf_counter()       # start clock (time.clock(), shown in lecture, was removed in Python 3.8)
c_to_f(100000)                 # call function
t1 = time.perf_counter() - t0  # stop clock
print("t =", t1, "s")
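A one-shot timing like the slide's is noisy. A minimal sketch using the standard library's timeit module (not part of the lecture) averages many runs to smooth out timer jitter:

```python
import timeit

def c_to_f(c):
    return c * 9 / 5 + 32

# run the call many times and report the average, smoothing out timer noise
t = timeit.timeit(lambda: c_to_f(100000), number=10000)
print(t / 10000, "seconds per call")
```

Even so, as the next slide argues, wall-clock time still depends on the machine, so it is a weak basis for comparing algorithms.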
TIMING PROGRAMS IS INCONSISTENT
• GOAL: to evaluate different algorithms
• running time varies between algorithms
• running time varies between implementations
• running time varies between computers
• running time is not predictable based on small inputs
• time varies for different inputs but cannot really express a relationship between inputs and time
COUNTING OPERATIONS
• assume these steps take constant time:
  ◦ mathematical operations
  ◦ comparisons
  ◦ assignments
  ◦ accessing objects in memory
• then count the number of operations executed as function of size of input

def c_to_f(c):
    return c*9.0/5 + 32

def mysum(x):
    total = 0
    for i in range(x+1):
        total += i
    return total

mysum → 1 + 3x ops
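As a rough illustration (not from the lecture), mysum can be instrumented to count its own operations under the slide's conventions. Note the loop actually runs x+1 times, so the exact count is 1 + 3(x+1); the slide's 1 + 3x differs only by a constant, which order-of-growth analysis ignores anyway.

```python
def mysum_counted(x):
    """mysum plus an operation counter (illustration only)."""
    ops = 1                    # total = 0
    total = 0
    for i in range(x + 1):
        ops += 3               # set i, add, assign to total
        total += i
    return total, ops

print(mysum_counted(10))  # (55, 34): the sum is 55; the count is 1 + 3*(10+1)
```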
COUNTING OPERATIONS IS BETTER, BUT STILL…
• GOAL: to evaluate different algorithms
• count depends on algorithm
• count depends on implementations
• count independent of computers
• no clear definition of which operations to count
• count varies for different inputs and can come up with a relationship between inputs and the count
STILL NEED A BETTER WAY
• timing and counting evaluate implementations
• timing evaluates machines
• want to evaluate algorithm
• want to evaluate scalability
• want to evaluate in terms of input size
STILL NEED A BETTER WAY
• going to focus on the idea of counting operations in an algorithm, but not worry about small variations in implementation (e.g., whether we take 3 or 4 primitive operations to execute the steps of a loop)
• going to focus on how the algorithm performs when the size of the problem gets arbitrarily large
• want to relate the time needed to complete a computation, measured this way, against the size of the input to the problem
• need to decide what to measure, given that the actual number of steps may depend on specifics of the trial
NEED TO CHOOSE WHICH INPUT TO USE TO EVALUATE A FUNCTION
• want to express efficiency in terms of size of input, so need to decide what your input is
• could be an integer -- mysum(x)
• could be length of list -- list_sum(L)
• you decide when there are multiple parameters to a function -- search_for_elmt(L, e)
DIFFERENT INPUTS CHANGE HOW THE PROGRAM RUNS
• a function that searches for an element in a list

def search_for_elmt(L, e):
    for i in L:
        if i == e:
            return True
    return False

• when e is first element in the list → BEST CASE
• when e is not in list → WORST CASE
• when look through about half of the elements in list → AVERAGE CASE
• want to measure this behavior in a general way
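One way to see the best and worst cases concretely is to add a step counter to the search (a hypothetical helper, not from the slides):

```python
def search_count(L, e):
    """search_for_elmt plus a step counter (illustration only)."""
    steps = 0
    for i in L:
        steps += 1
        if i == e:
            return True, steps
    return False, steps

L = list(range(100))
print(search_count(L, 0))    # (True, 1): best case, found on the first step
print(search_count(L, -1))   # (False, 100): worst case, every element examined
```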
BEST, AVERAGE, WORST CASES
• suppose you are given a list L of some length len(L)
• best case: minimum running time over all possible inputs of a given size, len(L)
  ◦ constant for search_for_elmt
  ◦ first element in any list
• average case: average running time over all possible inputs of a given size, len(L)
  ◦ practical measure
• worst case: maximum running time over all possible inputs of a given size, len(L)
  ◦ linear in length of list for search_for_elmt
  ◦ must search entire list and not find it
ORDERS OF GROWTH
Goals:
• want to evaluate program's efficiency when input is very big
• want to express the growth of program's run time as input size grows
• want to put an upper bound on growth – as tight as possible
• do not need to be precise: "order of" not "exact" growth
• we will look at largest factors in run time (which section of the program will take the longest to run?)
• thus, generally we want a tight upper bound on growth, as a function of size of input, in the worst case
MEASURING ORDER OF GROWTH: BIG OH NOTATION
• Big Oh notation measures an upper bound on the asymptotic growth, often called order of growth
• Big Oh or O() is used to describe worst case
  ◦ worst case occurs often and is the bottleneck when a program runs
  ◦ express rate of growth of program relative to the input size
  ◦ evaluate algorithm NOT machine or implementation
EXACT STEPS vs O()

def fact_iter(n):
    """assumes n an int >= 0"""
    answer = 1
    while n > 1:
        answer *= n
        n -= 1
    return answer

• computes factorial
• number of steps:
• worst case asymptotic complexity:
  ◦ ignore additive constants
  ◦ ignore multiplicative constants
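One way to fill in the blanks above, under one possible counting convention (5 operations per loop pass: the test, a multiply, an assign, a subtract, an assign), is sketched below. The loop body runs n-1 times, so the exact count here is 1 + 5(n-1) + 1; after ignoring additive and multiplicative constants, the worst case asymptotic complexity is O(n).

```python
def fact_iter_counted(n):
    """fact_iter with a step counter under one possible counting convention."""
    steps = 1          # answer = 1
    answer = 1
    while n > 1:
        steps += 5     # test, multiply, assign, subtract, assign
        answer *= n
        n -= 1
    steps += 1         # the final, failing test
    return answer, steps

print(fact_iter_counted(10))  # (3628800, 47): 10! computed in 1 + 5*(10-1) + 1 steps
```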
WHAT DOES O(N) MEASURE?
• interested in describing how the amount of time needed grows as size of (input to) problem grows
• thus, given an expression for the number of operations needed to compute an algorithm, want to know asymptotic behavior as size of problem gets large
• hence, will focus on the term that grows most rapidly in a sum of terms
• and will ignore multiplicative constants, since we want to know how rapidly time required increases as we increase size of input
SIMPLIFICATION EXAMPLES
• drop constants and multiplicative factors
• focus on dominant terms

O(n^2)     : n^2 + 2n + 2
O(n^2)     : n^2 + 100000n + 3^1000
O(n)       : log(n) + n + 4
O(n log n) : 0.0001*n*log(n) + 300n
O(3^n)     : 2n^30 + 3^n
TYPES OF ORDERS OF GROWTH
(graph of the growth curves for each complexity class; not reproduced in this text version)
ANALYZING PROGRAMS AND THEIR COMPLEXITY
• combine complexity classes
  ◦ analyze statements inside functions
  ◦ apply some rules, focus on dominant term
• Law of Addition for O():
  ◦ used with sequential statements
  ◦ O(f(n)) + O(g(n)) is O( f(n) + g(n) )
  ◦ for example,

for i in range(n):
    print('a')
for j in range(n*n):
    print('b')

is O(n) + O(n*n) = O(n + n*n) = O(n*n) because of the dominant n*n term
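The addition law above can be checked concretely by counting loop-body executions in the two sequential loops (a sketch, not from the slides):

```python
def count_sequential(n):
    """Count body executions of two sequential loops: n + n*n total."""
    ops = 0
    for i in range(n):        # O(n)
        ops += 1
    for j in range(n * n):    # O(n*n)
        ops += 1
    return ops

print(count_sequential(100))  # 10100 = 100 + 100*100; the n*n term dominates
```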
ANALYZING PROGRAMS AND THEIR COMPLEXITY
• combine complexity classes
  ◦ analyze statements inside functions
  ◦ apply some rules, focus on dominant term
• Law of Multiplication for O():
  ◦ used with nested statements/loops
  ◦ O(f(n)) * O(g(n)) is O( f(n) * g(n) )
  ◦ for example,

for i in range(n):
    for j in range(n):
        print('a')

is O(n)*O(n) = O(n*n) = O(n^2) because the outer loop goes n times and the inner loop goes n times for every outer loop iteration.
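The multiplication law can be checked the same way, by counting how often the inner loop body runs (a sketch, not from the slides):

```python
def count_nested(n):
    """Count body executions of the nested loops: n*n total."""
    ops = 0
    for i in range(n):
        for j in range(n):    # runs n times for every outer iteration
            ops += 1
    return ops

print(count_nested(50))  # 2500 = 50*50
```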
COMPLEXITY CLASSES
• O(1) denotes constant running time
• O(log n) denotes logarithmic running time
• O(n) denotes linear running time
• O(n log n) denotes log-linear running time
• O(n^c) denotes polynomial running time (c is a constant)
• O(c^n) denotes exponential running time (c is a constant being raised to a power based on size of input)
COMPLEXITY CLASSES ORDERED LOW TO HIGH
O(1)       : constant
O(log n)   : logarithmic
O(n)       : linear
O(n log n) : loglinear
O(n^c)     : polynomial
O(c^n)     : exponential
COMPLEXITY GROWTH

CLASS      | n=10 | n=100               | n=1000                | n=1000000
O(1)       | 1    | 1                   | 1                     | 1
O(log n)   | 1    | 2                   | 3                     | 6
O(n)       | 10   | 100                 | 1000                  | 1000000
O(n log n) | 10   | 200                 | 3000                  | 6000000
O(n^2)     | 100  | 10000               | 1000000               | 1000000000000
O(2^n)     | 1024 | 2^100 ≈ 1.27*10^30  | 2^1000 ≈ 1.07*10^301  | Good luck!!
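The polynomial rows of the table can be regenerated in a few lines; the slide's numbers imply base-10 logarithms (log 1000000 = 6):

```python
import math

# regenerate the polynomial rows of the growth table (log base 10)
for n in (10, 100, 1000, 1000000):
    logn = round(math.log10(n))
    print(f"n={n}: O(log n)={logn}, O(n)={n}, "
          f"O(n log n)={n * logn}, O(n^2)={n * n}")
```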
LINEAR COMPLEXITY
• simple iterative loop algorithms are typically linear in complexity
LINEAR SEARCH ON UNSORTED LIST

def linear_search(L, e):
    found = False
    for i in range(len(L)):
        if e == L[i]:
            found = True
    return found

• must look through all elements to decide it's not there
• O(len(L)) for the loop * O(1) to test if e == L[i]
  ◦ O(1 + 4n + 1) = O(4n + 2) = O(n)
• overall complexity is O(n) – where n is len(L)
CONSTANT TIME LIST ACCESS
• if list is all ints
  ◦ ith element at
  ◦ base + 4*i
• if list is heterogeneous
  ◦ references to other objects
  ◦ indirection
LINEAR SEARCH ON SORTED LIST

def search(L, e):
    for i in range(len(L)):
        if L[i] == e:
            return True
        if L[i] > e:
            return False
    return False

• must only look until we reach a number greater than e
• O(len(L)) for the loop * O(1) to test if e == L[i]
• overall complexity is O(n) – where n is len(L)
• NOTE: order of growth is the same, though run time may differ for the two search methods
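A step counter (a hypothetical helper, as with the unsorted search) shows the early exit concretely: a miss can stop well before the end, but the worst case is still linear.

```python
def search_sorted_count(L, e):
    """Sorted-list search plus a step counter (illustration only)."""
    steps = 0
    for i in range(len(L)):
        steps += 1
        if L[i] == e:
            return True, steps
        if L[i] > e:
            return False, steps
    return False, steps

evens = list(range(0, 200, 2))          # a sorted list of 100 elements
print(search_sorted_count(evens, 3))    # (False, 3): stops at the first value > 3
print(search_sorted_count(evens, 198))  # (True, 100): worst case is still linear
```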
LINEAR COMPLEXITY
• searching a list in sequence to see if an element is present
• add characters of a string, assumed to be composed of decimal digits

def addDigits(s):
    val = 0
    for c in s:
        val += int(c)
    return val

• O(len(s))
LINEAR COMPLEXITY
• complexity often depends on number of iterations

def fact_iter(n):
    prod = 1
    for i in range(1, n+1):
        prod *= i
    return prod

• number of times around loop is n
• number of operations inside loop is a constant (in this case, 3 – set i, multiply, set prod)
  ◦ O(1 + 3n + 1) = O(3n + 2) = O(n)
• overall just O(n)
NESTED LOOPS
• simple loops are linear in complexity
• what about loops that have loops within them?
QUADRATIC COMPLEXITY
• determine if one list is a subset of the second, i.e., every element of the first appears in the second (assume no duplicates)

def isSubset(L1, L2):
    for e1 in L1:
        matched = False
        for e2 in L2:
            if e1 == e2:
                matched = True
                break
        if not matched:
            return False
    return True
QUADRATIC COMPLEXITY

def isSubset(L1, L2):
    for e1 in L1:               # outer loop executed len(L1) times
        matched = False
        for e2 in L2:           # each iteration will execute inner loop
            if e1 == e2:        # up to len(L2) times, with constant
                matched = True  # number of operations
                break
        if not matched:
            return False
    return True

• O(len(L1)*len(L2))
• worst case when L1 and L2 same length, none of elements of L1 in L2
• O(len(L1)^2)
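A quick self-contained check of isSubset's behavior (usage examples only, not from the slides):

```python
def isSubset(L1, L2):
    for e1 in L1:
        matched = False
        for e2 in L2:
            if e1 == e2:
                matched = True
                break
        if not matched:
            return False
    return True

print(isSubset([1, 2], [3, 2, 1]))  # True: every element of [1, 2] appears in [3, 2, 1]
print(isSubset([1, 4], [3, 2, 1]))  # False: 4 never matches, so the inner loop runs to completion
```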
QUADRATIC COMPLEXITY
• find intersection of two lists, return a list with each element appearing only once

def intersect(L1, L2):
    tmp = []
    for e1 in L1:
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    res = []
    for e in tmp:
        if not(e in res):
            res.append(e)
    return res
QUADRATIC COMPLEXITY

def intersect(L1, L2):
    tmp = []                 # first nested loop takes
    for e1 in L1:            # len(L1)*len(L2) steps
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    res = []                 # second loop takes at most len(L1) steps;
    for e in tmp:            # determining if element in list
        if not(e in res):    # might take len(L1) steps
            res.append(e)
    return res

• if we assume lists are of roughly same length, then O(len(L1)^2)
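A quick self-contained check of intersect, including the duplicate removal done by the second loop (usage examples only, not from the slides):

```python
def intersect(L1, L2):
    tmp = []
    for e1 in L1:
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    res = []
    for e in tmp:
        if not (e in res):
            res.append(e)
    return res

print(intersect([1, 2, 3, 4], [3, 4, 5]))  # [3, 4]
print(intersect([1, 1, 2], [1]))           # [1]: duplicates removed by the second loop
```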
O() FOR NESTED LOOPS

def g(n):
    """ assume n >= 0 """
    x = 0
    for i in range(n):
        for j in range(n):
            x += 1
    return x

• computes n^2 very inefficiently
• when dealing with nested loops, look at the ranges
• nested loops, each iterating n times
• O(n^2)
THIS TIME AND NEXT TIME
• have seen examples of loops, and nested loops
• these give rise to linear and quadratic complexity algorithms
• next time, will more carefully examine examples from each of the different complexity classes
MIT OpenCourseWare
https://ocw.mit.edu

6.0001 Introduction to Computer Science and Programming in Python


Fall 2016

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
