CH 01 - Algorithm Analysis PDF
Contents
• Introduction to computing
• Algorithms
• Experimental analysis
• Pseudocode
• Random access machine
• Functions in algorithm analysis
• Asymptotic analysis
• Asymptotic notation
Big-Oh, Big-Omega, Big-Theta, Little-Oh, Little-Omega
• Examples
Search
Prefix averages
Maximum contiguous subsequence sum
• Java and Eclipse
1
Computing: What is computer science?
• “My computer crashed. How do I fix it?”
• “How do I insert an equation in MS Word?”
None of these!
2
Abstract vs. Real Computer
• Abstract computer: Turing machine (TM)
Control unit: state q
Infinite tape: ... B B 0 0 1 B ...
• Real computers: NASA’s supercomputer, my little smartphone
Language
• A set of strings from an alphabet, e.g., L = {100, 1000, 1100, 10000, 10100, …}
• TM decides/solves any recursive language (= algorithm)
Input Output
• Properties
Correctness: must provide the correct output for every input
Performance: Measured in terms of the resources used (time and space)
End: must finish in a finite amount of time
5
Input size
• Performance of an algorithm measured in terms of the input size
• Examples:
Number of elements in a list or array A: n
A:     21 22 23 25 24 23 22
index:  0  1  2  3  4  5  6
6
Experimental vs Theoretical analysis
Experimental analysis:
• Write a program that implements the algorithm
• Keep track of the CPU time used by the program on each input size
Limitations:
[Plot: Time (ms) for each input size]
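The experimental procedure above can be sketched in Java as follows. This is only a minimal illustration: the class `TimingSketch`, the method names, and the choice of summing an array as the algorithm under test are assumptions, not part of the course code. It also uses `System.nanoTime()`, which measures elapsed wall-clock time rather than CPU time, so the numbers depend on the machine and the JVM.

```java
import java.util.Random;

// Hypothetical harness: times a simple linear scan for doubling input sizes.
final class TimingSketch {

    // Algorithm under test (chosen only for illustration): sum of an array.
    static long sumArray(int[] a) {
        long total = 0;
        for (int x : a) {
            total += x;
        }
        return total;
    }

    // Elapsed time, in nanoseconds, of one run on a random input of size n.
    static long timeOnce(int n, long seed) {
        int[] a = new Random(seed).ints(n, 0, 100).toArray();
        long start = System.nanoTime();
        long result = sumArray(a);
        long elapsed = System.nanoTime() - start;
        if (result < 0) {  // use the result so the call cannot be skipped
            throw new IllegalStateException("unexpected negative sum");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // One measurement per input size; real experiments should repeat and average.
        for (int n = 1_000; n <= 1_024_000; n *= 2) {
            System.out.println(n + "\t" + timeOnce(n, 42L) / 1_000_000.0 + " ms");
        }
    }
}
```

A single run per size is noisy; repeating each measurement and averaging gives a more stable curve.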
7
Theoretical analysis – main framework
Advantages:
• Uses a high-level description of the algorithm instead of an implementation
• Characterizes running time as a function of the input size, n.
• Takes into account all possible inputs
• Allows us to evaluate the speed of an algorithm independently of the hardware/software environment
Framework:
1. Problem: Given array A, find max of A
2. Write algorithm in pseudocode:
max ← A[0]
for i ← 1 to n − 1 do
    if A[i] > max then
        max ← A[i]
return max
3. Count primitive operations: T(n) = 8n – 3
4. Find asymptotic notation for T(n): O, Ω, Θ, o, ω. Here T(n) is Θ(n)
5. Compare with other algorithms: Divide and conquer? Recursive? Linear time?
8
Pseudocode
Pseudocode provides a high-level description of an algorithm and omits details that are unnecessary for
the analysis.
9
Random Access Machine (RAM)
10
Most important functions used in Algorithm Analysis
• The following functions often appear in algorithm analysis:
N-Log-N: n log n
Quadratic: n^2
Cubic: n^3
Exponential: 2^n
[Plot: T(n) = time versus n = size of input]
11
Primitive operations
• Basic computations performed by an algorithm
• Identifiable in pseudocode
• Largely independent of the programming language
• Exact definition not important (we will see why later)
• Take a constant amount of time in the RAM model (one unit of time or constant time)
Examples:
Evaluating an expression, e.g. a − 5 + c√b
Assigning a value to a variable, e.g. a ← 23
Indexing into an array, e.g. A[i]
Calling a method, e.g. v.method()
Returning from a method, e.g. return a
Counting primitive operations:
Algorithm arrayMax(A, n)                # operations
currentMax ← A[0]                       2
for i ← 1 to n − 1 do                   2n
    if A[i] > currentMax then           2(n − 1)
        currentMax ← A[i]               2(n − 1)
    { increment counter i }             2(n − 1)
return currentMax                       1
Total: T(n) = 8n − 3
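The operation count for arrayMax can be checked with a small Java sketch. The class name `ArrayMaxCount` and the `ops` counter are illustrative assumptions; the tally simply follows the per-line charges used on the slide (2n for the loop header, 2 per comparison, 2 per assignment, 2 per increment, 1 for the return), so it reproduces T(n) = 8n − 3 only in the worst case, when every element is larger than the previous one.

```java
// Hypothetical class: arrayMax with an explicit tally of primitive operations.
final class ArrayMaxCount {

    static long ops;  // primitive operations counted during the last call

    // Returns the maximum of a non-empty array and records the count in ops.
    static int arrayMax(int[] a) {
        int n = a.length;
        int currentMax = a[0];
        ops = 2;                      // index A[0] and assign
        ops += 2L * n;                // loop header, charged 2n as on the slide
        for (int i = 1; i <= n - 1; i++) {
            ops += 2;                 // index A[i] and compare
            if (a[i] > currentMax) {
                currentMax = a[i];
                ops += 2;             // index A[i] and assign
            }
            ops += 2;                 // increment counter i
        }
        ops += 1;                     // return
        return currentMax;
    }
}
```

For an increasing array of length 4 the counter ends at 8·4 − 3 = 29; for a decreasing array the assignment inside the `if` never runs and the count drops to 6·4 − 1 = 23.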
12
Case analysis
Three cases
• Worst case:
among all possible inputs, the one which takes the largest amount of time.
• Best case:
The input for which the algorithm runs the fastest
• Average case:
[Plot: experimental analysis, running time]
13
Asymptotic notation
Name | Notation | Informal name | Bound | Notes/use
Big-Oh | Ο(𝑛) | order of | Upper bound – tight | The most commonly used notation for assessing the complexity of an algorithm
Big-Theta | Θ(𝑛) | | Upper and lower bound – tight | The most accurate asymptotic notation
Big-Omega | Ω(𝑛) | | Lower bound – tight | Mostly used for determining lower bounds on problems rather than algorithms (e.g., sorting)
Little-Oh | 𝜊(𝑛) | | Upper bound – loose | Used when it is difficult to obtain a tight upper bound
Little-Omega | 𝜔(𝑛) | | Lower bound – loose | Used when it is difficult to obtain a tight lower bound
14
Big-Oh notation
15
Big-Oh rules - properties
• The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no
more than the growth rate of g(n)
• O(g(n)) is a set or class of functions: it contains all the functions whose
growth rate is no more than that of g(n)
• If f(n) is a polynomial of degree d, then f(n) is O(n^d)
If d = 0, then f(n) is O(1)
Example: n^2 + 3n − 1 is O(n^2)
• We always use the simplest expression of the class/set
E.g., we state 2n + 3 is O(n) instead of O(4n) or O(3n+1)
• We always use the smallest possible class/set of functions
E.g., we state 2n is O(n) instead of O(n^2) or O(n^3)
• Linearity of asymptotic notation
O(f(n)) + O(g(n)) = O(f(n) + g(n)) = O(max{f(n), g(n)}), where “max” is with respect to growth rate
Example: O(n) + O(n^2) = O(n + n^2) = O(n^2)
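As a worked check of the polynomial rule, the following sketch justifies the example n^2 + 3n − 1 is O(n^2). It assumes the usual definition of Big-Oh (f(n) ≤ c g(n) for all n ≥ n0, for some constant c > 0 and integer n0 ≥ 1); the constants c = 4 and n0 = 1 are one possible choice, not the only one.

```latex
\begin{align*}
n^2 + 3n - 1 &\le n^2 + 3n       && \text{since } -1 < 0 \\
             &\le n^2 + 3n^2     && \text{since } n \le n^2 \text{ for } n \ge 1 \\
             &=   4n^2 .
\end{align*}
% Hence f(n) = n^2 + 3n - 1 satisfies f(n) \le c\, g(n) with
% g(n) = n^2, \; c = 4, \; n_0 = 1, so f(n) is O(n^2).
```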
16
Big-Omega and Big-Theta notations
• big-Omega
f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1
such that f(n) ≥ c g(n) for n ≥ n0
Example: 3n^3 – 2n + 1 is Ω(n^3)
• big-Theta
f(n) is Θ(g(n)) if there are constants c’ > 0 and c’’ > 0 and an integer constant
n0 ≥ 1 such that c’ g(n) ≤ f(n) ≤ c’’ g(n) for n ≥ n0
Example: 5n log n – 2n is Θ(n log n)
Important axiom:
f(n) is O(g(n)) and Ω(g(n)) ⇔ f(n) is Θ(g(n))
Example: 5n^2 is O(n^2) and Ω(n^2) ⇔ 5n^2 is Θ(n^2)
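The big-Theta example 5n log n – 2n can be verified directly from the definition. The sketch below assumes logarithms in base 2, so that log n ≥ 1 for n ≥ 2; the constants c' = 3, c'' = 5 and n0 = 2 are one convenient choice.

```latex
\begin{align*}
\text{Upper bound: } & 5n\log n - 2n \le 5\, n\log n
   && \text{for } n \ge 1, \\
\text{Lower bound: } & 5n\log n - 2n \ge 5n\log n - 2n\log n = 3\, n\log n
   && \text{for } n \ge 2 \ (\log n \ge 1).
\end{align*}
% So c'\, g(n) \le f(n) \le c''\, g(n) holds with g(n) = n \log n,
% c' = 3, \; c'' = 5, \; n_0 = 2, and f(n) is \Theta(n \log n).
```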
17
Asymptotic notation – graphical comparison
Big-Oh
f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
Big-Omega
Big-Theta
[Plot, normal scale: f(n) = n^2 with c’ g(n) = 0.5n^2 and c’’ g(n) = 2n^2]
[Plot, log-log scale: n^2, 3n^2 and 0.5n^2]
18
Little-Oh and Little-Omega notations
• Little-Oh
f(n) is o(g(n)) if for any constant c > 0, there is a constant n0 > 0
such that f(n) < c g(n) for n ≥ n0
Example: 3n^2 – 2n + 1 is o(n^3), while 3n^2 – 2n + 1 is not o(n^2)
• Little-Omega
f(n) is ω(g(n)) if for any constant c > 0, there is a constant n0 > 0 such that
f(n) > c g(n) for n ≥ n0
Example: 3n^2 – 2n + 1 is ω(n), while 3n^2 – 2n + 1 is not ω(n^2)
Important axiom:
f(n) is o(g(n)) ⇔ g(n) is ω(f(n))
Comparison with O and Ω
For O and Ω, the inequality holds if there exists a constant c > 0
For o and ω, the inequality holds for all constants c > 0
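One common way to check the little-oh examples is through limits. The sketch below relies on the standard fact (not stated on the slide) that, for positive g(n), f(n) is o(g(n)) exactly when f(n)/g(n) tends to 0.

```latex
\begin{align*}
\lim_{n\to\infty} \frac{3n^2 - 2n + 1}{n^3}
  &= \lim_{n\to\infty} \left( \frac{3}{n} - \frac{2}{n^2} + \frac{1}{n^3} \right) = 0
  && \Rightarrow\ 3n^2 - 2n + 1 \text{ is } o(n^3), \\
\lim_{n\to\infty} \frac{3n^2 - 2n + 1}{n^2}
  &= \lim_{n\to\infty} \left( 3 - \frac{2}{n} + \frac{1}{n^2} \right) = 3 \ne 0
  && \Rightarrow\ 3n^2 - 2n + 1 \text{ is not } o(n^2).
\end{align*}
```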
19
Case study 1: Search in a Map (unsorted list)
• Problem: Given an unsorted array S of integers (a map), find a key k in that map.
• One of the most important problems in computer science
• Solution 1: Linear search
Scan the elements in the list one by one until the key k is found
• Example:
S[i] 33 12 19  2 23 11 41 15
i     0  1  2  3  4  5  6  7
Algorithm linearSearch(S, k, n):
Input: Unsorted array S of size n, and key k
Output: Null or the element found
i ← 0
while i < n and S[i] != k
    i ← i + 1
if i = n then
    return null
else
    return S[i]
Worst-case running time: T(n) = 3n + 3, so T(n) is O(n)
• Linear search runs in linear time.
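A minimal Java sketch of linear search over an `int` array is given below. The class name `LinearSearchSketch` is hypothetical, and the method returns the index of the key (or -1) instead of the element or null used in the pseudocode, since a primitive `int` cannot be null.

```java
// Hypothetical class: linear search on an unsorted int array.
final class LinearSearchSketch {

    // Returns the index of k in s, or -1 if k does not occur.
    static int linearSearch(int[] s, int k) {
        int i = 0;
        // Scan left to right until the key is found or the array is exhausted.
        while (i < s.length && s[i] != k) {
            i++;
        }
        return (i == s.length) ? -1 : i;
    }
}
```

On the slide's example array, searching for 23 stops at index 4, while searching for a missing key scans all 8 elements, which is the worst case.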
20
Case study 1: Search in a Map (sorted list)
• Problem: Given a sorted array of integers (a map), find a key k in that map.
• Solution 2: Binary search
• Binary search runs in logarithmic time
• Same problem: Two algorithms run in different times
Algorithm binarySearch(S, k, low, high):
Input: A sorted array S, a key k, and the index range low to high
Output: Null or the element found
if low > high then
    return null
else
    mid ← ⌊(low + high) / 2⌋
    e ← S[mid]
    if k = e.getKey() then
        return e
    else if k < e.getKey() then
        return binarySearch(S, k, low, mid-1)
    else
        return binarySearch(S, k, mid+1, high)
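The recursive binary search can be sketched in Java as follows. The class name `BinarySearchSketch` is an assumption, the array holds plain `int` keys rather than entries with `getKey()`, and the method returns an index (or -1) instead of an element or null.

```java
// Hypothetical class: recursive binary search on a sorted int array.
final class BinarySearchSketch {

    // Searches s[low..high] (inclusive) for k; returns its index or -1.
    static int binarySearch(int[] s, int k, int low, int high) {
        if (low > high) {
            return -1;                          // empty range: key not present
        }
        int mid = low + (high - low) / 2;       // same as (low + high) / 2, without int overflow
        if (k == s[mid]) {
            return mid;
        } else if (k < s[mid]) {
            return binarySearch(s, k, low, mid - 1);   // continue in the left half
        } else {
            return binarySearch(s, k, mid + 1, high);  // continue in the right half
        }
    }
}
```

A typical call is `binarySearch(s, k, 0, s.length - 1)`. Each call halves the range, which is where the logarithmic running time comes from.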
21
Case study 2: Prefix averages
• The i-th prefix average of an array S is the average of the first (i + 1) elements of S:
A[i] = (S[0] + S[1] + … + S[i])/(i+1)
• Problem: Compute the array A of prefix averages of another array S
• Has applications in financial analysis
• Solution 1: A quadratic-time algorithm: quadPrefixAve
• Example:
S 21 23 25 31 20 18 16
   0  1  2  3  4  5  6
A 21 22 23 25 24 23 22
   0  1  2  3  4  5  6
Algorithm quadPrefixAve(S, n)
Input: array S of n integers
Output: array A of prefix averages of S
                                        #operations
A ← new array of n integers             n
for i ← 0 to n − 1 do                   n
    s ← S[0]                            n – 1
    for j ← 1 to i do                   1 + 2 + … + (n − 1)
        s ← s + S[j]                    1 + 2 + … + (n − 1)
    A[i] ← s / (i + 1)                  n – 1
return A                                1
where 1 + 2 + … + (n − 1) = n(n − 1)/2
T1(n) = 2n + 2(n − 1) + 2n(n − 1)/2 + 1 is O(n^2)
22
Case study 2: Prefix averages
• Solution 2: A linear-time algorithm: linearPrefixAve
• For each element being scanned, keep the running sum
S 21 23 25 31 20 18 16
   0  1  2  3  4  5  6
A 21 22 23 25 24 23 22
   0  1  2  3  4  5  6
Algorithm linearPrefixAve(S, n)
Input: array S of n integers
Output: array A of prefix averages of S
                                        #operations
A ← new array of n integers             n
s ← 0                                   1
for i ← 0 to n − 1 do                   n
    s ← s + S[i]                        n – 1
    A[i] ← s / (i + 1)                  n – 1
return A                                1
T2(n) = 4n is O(n)
[Plot: S and A versus index 0 to 6]
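Both prefix-average solutions can be sketched in Java as shown below. The class name `PrefixAverages` is an assumption, and the output is a `double[]` so that averages which are not whole numbers are not truncated; the pseudocode itself does not specify the element type of A.

```java
// Hypothetical class: quadratic and linear prefix averages.
final class PrefixAverages {

    // Recomputes each prefix sum from scratch: O(n^2) additions overall.
    static double[] quadPrefixAve(int[] s) {
        int n = s.length;
        double[] a = new double[n];
        for (int i = 0; i < n; i++) {
            long sum = s[0];
            for (int j = 1; j <= i; j++) {
                sum += s[j];
            }
            a[i] = (double) sum / (i + 1);
        }
        return a;
    }

    // Keeps a running sum, so each element is added exactly once: O(n).
    static double[] linearPrefixAve(int[] s) {
        int n = s.length;
        double[] a = new double[n];
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += s[i];
            a[i] = (double) sum / (i + 1);
        }
        return a;
    }
}
```

Both methods return the same array; only the amount of repeated work differs, which is what the two operation counts capture.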
23
Case study 3: Maximum contiguous subsequence sum (MCSS)
• Problem:
Given: a sequence of integers (possibly negative) A = A_1, A_2, …, A_n
Find: the maximum value of Σ_{k=i}^{j} A_k
If all integers are negative the MCSS is 0
• Example:
For A = -3, 10, -2, 11, -5, -2, 3 the MCSS is 19
For A = -7, -10, -1, -3 the MCSS is 0
For A = 12, -5, -6, -4, 3 the MCSS is 12
• Various algorithms solve the same problem
Cubic time
Quadratic time
Divide and conquer
Linear time
24
MCSS: Cubic vs quadratic time algorithms
Algorithm cubicMCSS(A, n)
Input: A sequence of integers A of length n
Output: The value of the MCSS
maxS ← 0
for i ← 0 to n – 1 do
    for j ← i to n – 1 do
        curS ← 0
        for k ← i to j do
            curS ← curS + A[k]
        if curS > maxS
            maxS ← curS
return maxS
Number of executions of the innermost statement:
Σ_{i=0}^{n−1} Σ_{j=i}^{n−1} Σ_{k=i}^{j} 1 = (n^3 + 3n^2 + 2n)/6
T(n) = (n^3 + 3n^2 + 2n)/6 + c is O(n^3)

Algorithm quadraticMCSS(A, n)
Input: A sequence of integers A of length n
Output: The value of the MCSS
maxS ← 0
for i ← 0 to n – 1 do
    curS ← 0
    for j ← i to n – 1 do
        curS ← curS + A[j]
        if curS > maxS
            maxS ← curS
return maxS
Number of executions of the inner loop body: Σ_{i=0}^{n−1} Σ_{j=i}^{n−1} 1 = n(n + 1)/2
The double sum will give O(n^2)
Example: for A = -3, 10, -2, 11, -5, -2, 3 the MCSS is 19
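A Java sketch of the cubic and quadratic algorithms follows. The class name `MaxSumSlow` is hypothetical and is not the `MaxSumTest` class mentioned in the lab; the methods only mirror the pseudocode, with the running sum reset at the start of each new starting index i in the quadratic version.

```java
// Hypothetical class: brute-force MCSS in cubic and quadratic time.
final class MaxSumSlow {

    // Tries every pair (i, j) and re-adds A[i..j] each time: O(n^3).
    static int cubicMCSS(int[] a) {
        int maxS = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i; j < a.length; j++) {
                int curS = 0;
                for (int k = i; k <= j; k++) {
                    curS += a[k];
                }
                if (curS > maxS) {
                    maxS = curS;
                }
            }
        }
        return maxS;
    }

    // Extends the sum A[i..j-1] by A[j] instead of recomputing it: O(n^2).
    static int quadraticMCSS(int[] a) {
        int maxS = 0;
        for (int i = 0; i < a.length; i++) {
            int curS = 0;  // reset for each new starting index i
            for (int j = i; j < a.length; j++) {
                curS += a[j];
                if (curS > maxS) {
                    maxS = curS;
                }
            }
        }
        return maxS;
    }
}
```

Because `maxS` starts at 0, both methods return 0 when every element is negative, matching the convention stated in the problem definition.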
25
MCSS: Divide and conquer
• Main features:
Rather lengthy
Split the sequence into two (first half and second half)
Example: A = -3, 10, -2, 11, -1, 2, -3
26
Linear time algorithm
• Tricky parts of this algorithm are:
No MCSS will start or end with a negative number
We only find the value of the MCSS
But if we need the actual subsequence, we’ll need to resort to at least divide and conquer
Algorithm linearMCSS(A, n)
Input: A sequence of integers A of length n
Output: The value of the MCSS
maxS ← 0; curS ← 0
for j ← 0 to n – 1 do
    curS ← curS + A[j]
    if curS > maxS
        maxS ← curS
    else
        if curS < 0
            curS ← 0
return maxS
Example: for A = -3, 10, -2, 11, -5, -2, 3 the MCSS is 19
The single for loop gives O(n)
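The linear-time algorithm translates almost line for line into Java. The sketch below uses a hypothetical class name, `MaxSumLinear`, and returns only the value of the MCSS, as the pseudocode does.

```java
// Hypothetical class: linear-time MCSS, following the pseudocode.
final class MaxSumLinear {

    static int linearMCSS(int[] a) {
        int maxS = 0;
        int curS = 0;
        for (int j = 0; j < a.length; j++) {
            curS += a[j];
            if (curS > maxS) {
                maxS = curS;      // new best sum ending at position j
            } else if (curS < 0) {
                curS = 0;         // a negative running sum cannot start an MCSS
            }
        }
        return maxS;
    }
}
```

On A = -3, 10, -2, 11, -5, -2, 3 the running sum takes the values 0 (after reset), 10, 8, 19, 14, 12, 15, so the maximum recorded is 19.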
27
Example: Best vs worst case
Loops: Worst case: take maximum; Best case: take minimum
                                worst   best
i ← 0                           1       1
while i < n and A[i] != 7       n       1
    i ← i + 1                   n       0
                                O(n)    O(1)
Worst-case input: 3 1 4 2 3 2 1 8 (indices 0 to 7; 7 does not occur, so i reaches n)
Best-case input: 7 1 5 4 8 2 1 9 (indices 0 to 7; 7 is at index 0)
28
Our programming language: Java
According to Sun’s Java developers:
“Java is a simple, object-oriented, distributed, secure,
architecture, robust, multi threaded and dynamic
language. The program can be written once, and run
anywhere”
[Figure: Java SE Conceptual Diagram]
• It runs on the Java Virtual Machine (JVM)
Main features of Java 8
• Object is the main data type, and can be
as general as we want:
public Object read()
public void write(Object x)
• Classes and interfaces can be generic too:
public class MyClass<AnyType> {
public AnyType read() {…}
….
• Static methods can be generic too:
public static <AnyType> boolean find(AnyType[] a,
AnyType x) { … }
• Generic classes/objects have some restrictions
(primitive types, and others)
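The generic features listed above can be illustrated with a short Java sketch. The names `GenericsSketch` and `Cell` are hypothetical, and the body of `find` is one possible implementation; the slide only shows its signature.

```java
import java.util.Objects;

// Hypothetical class: a generic container and a generic static method.
final class GenericsSketch {

    // Generic class: stores a single value of any reference type.
    static final class Cell<AnyType> {
        private AnyType value;

        AnyType read() {
            return value;
        }

        void write(AnyType x) {
            value = x;
        }
    }

    // Generic static method: true if x occurs in a (null-safe comparison).
    static <AnyType> boolean find(AnyType[] a, AnyType x) {
        for (AnyType item : a) {
            if (Objects.equals(item, x)) {
                return true;
            }
        }
        return false;
    }
}
```

The primitive-type restriction shows up here: `find` accepts an `Integer[]` but not an `int[]`, because type parameters can only stand for reference types.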
29
Java – some facts*
• 10 Million Java Developers
Worldwide
• #1 Choice for Developers
• #1 development platform in the
cloud
• 5 million students study Java
• 2nd most important programming
language for IEEE Spectrum
30
Java – more facts
Java installation window Android apps are written in Java
31
Java – even more…
• Java ranked among the top programming languages of 2017 by IEEE Spectrum
http://spectrum.ieee.org/static/interactive-the-top-programming-languages-2017
32
Java SE and EE 8
• Our implementations will use Java Standard Edition (Java SE 8)
http://www.oracle.com/technetwork/java/javase/overview/index.html
• A more advanced edition is the Java Enterprise Edition (Java EE 8)
http://www.oracle.com/technetwork/java/javaee/overview/index.html
• Java EE is developed using the Java Community Process
• Includes contributions from experts (industry, commercial, organizations,
etc.)
• Releases and new features are aligned with contributors
• Main features:
a rich platform, widely used,
scalable, low risk, etc.
enhances HTML5 support and
increases developer productivity
Java EE 8 developers have support
for latest Web applications and
frameworks
*Image from www.oracle.com
33
Eclipse
• Main features
IDE and Tools
Desktop IDEs: Supports Java Integrated
Development Environment (IDE), PHP, Java SE/EE,
C/C++
Web IDEs: Software placed in the cloud, and
accessed from anywhere (desktop, laptop or tablet)
Community of Projects
Can participate and contribute to other projects
Collaborative Working Groups
It’s an open industry collaboration used to develop
new industry platforms
Current groups include automotive, location tech,
science, long-term support, Internet of Things (IoT),
PolarSys (embedded systems)
34
Eclipse – main platform
• Eclipse platform is structured as
a set of components
implemented through plug-ins
• Components are built on top of
a small runtime engine
• The Workbench is the desktop
development environment
• It’s an integration tool for
creation, management, and
navigation of workspace
resources
35
Review and Further Reading
• Math and proofs:
Arithmetic and geometric series
Summations
Logarithms and properties
Proofs and justifications
Basic probability
Section 1.2 of [3] and Sections 1.1, 1.2, 1.3, 1.4 and 1.6 of [1]
Appendices A and C of [5].
• Algorithm analysis:
Ch. 4 of [2], Ch. 2 of [4], Ch. 3 of [5]
• Java
Main sites by Oracle: [7],[8],[9]
• Eclipse
Main site: [6]. Basic tutorial: Eclipse > Workbench User Guide > Getting started
36
References
1. Algorithm Design and Applications by M. Goodrich and R. Tamassia, Wiley,
2015.
2. Data Structures and Algorithms in Java, 6th Edition, by M. Goodrich and R.
Tamassia, Wiley, 2014. (On reserve in the Leddy Library)
3. Data Structures and Algorithm Analysis in Java, 3rd Edition, by M. Weiss,
Addison-Wesley, 2012.
4. Algorithm Design by J. Kleinberg and E. Tardos, Addison-Wesley, 2006.
5. Introduction to Algorithms, 2nd Edition, by T. Cormen et al., McGraw-Hill, 2001.
6. The Eclipse Foundation, http://www.eclipse.org/
7. Java by Oracle: http://www.oracle.com/technetwork/java/index.html
8. Java Standard Edition (Java SE 8)
http://www.oracle.com/technetwork/java/javase/overview/index.html
9. Java Enterprise Edition (Java EE 8)
http://www.oracle.com/technetwork/java/javaee/overview/index.html
37
Lab – Practice
1. Use the class MaxSumTest.java provided in the source code.
a) Create a new Java project in Eclipse (called “Algorithm Analysis”)
b) Create a new package called “analysis”
c) Create a new class called MaxSumTest and enter the provided source code.
d) Run the four algorithms with the example provided in the source code.
e) Create random sequences of n integers between –n/2 and +3n/2
f) Run the program of 1.e for n = 2^i, where i = 3, 4, …, 16, and record the CPU
time for each algorithm and each value of i
g) Create a table with the times obtained and discuss these. Compare the CPU
times with the complexity (running time) of the algorithms
h) Plot the CPU times obtained for each algorithm and compare the plots with
those of the most important functions used in algorithm analysis (slide 11)
2. *Repeat all items of #1 for the Prefix Averages problem.
a) In this case, you will have to implement the two algorithms (as they are not
provided).
38
Exercises
1. Sort the functions 12n^2, 3n, 0.5 log n, n log n, 2n^3 in increasing order of growth rate.
2. Algorithm A uses 20 n log n operations, while algorithm B uses 2n^2 operations. Assume all operations take the same time. What is the value of n0 for which A will run faster? Which algorithm will you use if your inputs are of size 10,000?
3. *Prove or disprove that (a) f(n) = 2n^2 + 3 is O(n), (b) f(n) is O(n^10), (c) f(n) is (1), (d) f(n) is (1), (e) f(n) is (n^0.5), (f) f(n) (n^2), (g) f(n) is o(n^(9/4)).
4. Consider A = 3, 4, -7, 3, 6, -3, 2, 8, -1. Find the MCSS using the four algorithms discussed in class. *How many operations will the linear-time algorithm use?
5. Find the worst-case running time in O-notation for algorithms linearMCSS and quadraticMCSS. Show all steps used in the calculations.
6. *Implement linearMCSS and divide-and-conquer-MCSS. Run them on several random sequences of size 1,000. Which one is faster? What are the inputs for which the algorithms run faster/slower? Take the average CPU times and compare them with the running times of the algorithms.
7. Are these statements true? (a) If f(n) is O(g(n)), then f(n) is (g(n)); (b) if f(n) is O(g(n)), then g(n) is O(g(n)); (c) if f(n) is O(g(n)) and f(n) is (g(n)), then f(n) is (g(n)); (d) if f(n) is O(g(n)) and g(n) is (f(n)), then f(n) is (g(n)); (e) if f(n) is o(g(n)), then f(n) is O(g(n)); (f) if f(n) is (g(n)), then f(n) is (g(n)).
8. Give five different reasons for why we use Java in this course.
9. Describe the main features of the Eclipse IDE platform.
10. What is the advantage of using Java Enterprise Edition?
11. Consider A = 3, 4, -7, 3, 6, -3, 2, 8, -1. Can the algorithm linearMCSS be modified to run a little bit faster on this example? Yes/no? Why not?
12. Give an example of an algorithm whose best and worst case running times differ by more than a constant factor (i.e., are asymptotically different).
13. *Implement linear search and binary search on an array A. Run a variety of experiments on different lists of different sizes. Which algorithm will you use and for which sizes?
14. List the main methods of proofs used for asymptotic notation. What about Ω and Θ? Also, ω and o?
15. List the main properties of the rules for O-notation. Do these apply to Ω and Θ? How?
16. What is the difference between problem and algorithm? Explain how a problem can be solved using different algorithms and how they may have different complexities. Give an example.
17. Give examples of functions for the different types of asymptotic notation.
39