DataStructure_3_complexity

Uploaded by

reifuyuki1119
CHAPTER 1

BASIC CONCEPT
How to create programs
. Requirements
. Analysis: bottom-up vs. top-down
. Design: data objects and
operations
. Refinement and Coding
. Verification
. Program Proving
. Testing
. Debugging
Data Type
. Data Type
. A collection of objects and a set of operations that act on those objects
. Abstract Data Type


. An abstract data type (ADT) is a data type that is organized in such a way that
. the specification of the objects and the operations on the objects
is separated from
. the representation of the objects and the implementation of the operations
Specification vs.
Implementation
. Operation specification
. function name
. the types of arguments
. the type of the results
. Implementation independent
Describe Ferris
Wheel?
Specification:
multiple passenger
cars
attached to the
rim
Operations:
rotate upright
Describe Natural
Number?
Specification:
An ordered subrange of the integers
starting at zero and ending at the
maximum integer
Operations:
Zero, Is_Zero, Add, Sub, Succ, Eq
*Structure 1.1: Abstract data type Natural_Number (p.17)
structure Natural_Number is
  objects: an ordered subrange of the integers starting at zero and ending at the maximum integer (INT_MAX) on the computer
  functions: for all x, y ∈ Nat_No; TRUE, FALSE ∈ Boolean, and where +, -, <, and == are the usual integer operations
    Nat_No Zero()        ::= 0
    Boolean Is_Zero(x)   ::= if (x) return FALSE
                             else return TRUE
    Nat_No Add(x, y)     ::= if ((x+y) <= INT_MAX) return x+y
                             else return INT_MAX
    Boolean Eq(x, y)     ::= if (x == y) return TRUE
                             else return FALSE
    Nat_No Succ(x)       ::= if (x == INT_MAX) return x
                             else return x+1
    Nat_No Sub(x, y)     ::= if (x < y) return 0
                             else return x-y

"::=" is read as "is defined as"
*Structure 1.1: Abstract data type Natural_Number (p.17)
  axioms: (describe relations)
    Is_Zero(Zero())        ::= TRUE
    Is_Zero(Succ(x))       ::= FALSE
    Add(Zero(), y)         ::= y
    Add(Succ(x), y)        ::= Succ(Add(x, y))
    Eq(x, Zero())          ::= Is_Zero(x)
    Eq(Zero(), Succ(y))    ::= FALSE
    Eq(Succ(x), Succ(y))   ::= Eq(x, y)
    Sub(x, Zero())         ::= x
    Sub(Zero(), Succ(y))   ::= Zero()
    Sub(Succ(x), Succ(y))  ::= Sub(x, y)
end Natural_Number
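The specification above can be tried out directly. The following is a minimal Python sketch, not the textbook's implementation: the lower-case function names are hypothetical, and `sys.maxsize` is only an assumed stand-in for the machine's INT_MAX.

```python
import sys

# Assumption: sys.maxsize plays the role of INT_MAX from Structure 1.1.
INT_MAX = sys.maxsize


def zero():
    """Nat_No Zero() ::= 0"""
    return 0


def is_zero(x):
    """Boolean Is_Zero(x): TRUE only for zero."""
    return x == 0


def add(x, y):
    """Nat_No Add(x, y): saturates at INT_MAX instead of overflowing."""
    return x + y if x + y <= INT_MAX else INT_MAX


def eq(x, y):
    """Boolean Eq(x, y)"""
    return x == y


def succ(x):
    """Nat_No Succ(x): the successor, capped at INT_MAX."""
    return x if x == INT_MAX else x + 1


def sub(x, y):
    """Nat_No Sub(x, y): never goes below zero."""
    return 0 if x < y else x - y
```

A caller only needs these six operations; whether the values are stored as machine integers or in some other representation stays hidden, which is the point of separating specification from implementation.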
Truly Understand

. You can clearly explain it in your own words!
What is Algorithm?
. An algorithm is
. any well-defined computational procedure that
. takes some value, or set of values, as input
. produces some value, or set of values, as output
. A tool for solving a well-specified computational problem
Algorithm
. Definition: An algorithm is a finite set of instructions that accomplishes a particular task
. Criteria
. input
. output
. definiteness: clear and unambiguous
. finiteness: terminate after a finite
number of steps
. effectiveness: instruction is basic
enough to be carried out
Algorithm Definition with 5 Conditions
. Input: zero or more (≥ 0)
. Output: at least one (> 0)
. Definiteness, Unambiguity
. Each operation is clearly defined
. Counterexample: Y := X / 0
. Finiteness, Termination
. The algorithm must eventually halt
. It never enters an infinite loop
. Effectiveness
. Each operation is basic and achievable (step by step)
Algorithm vs. Procedure
. The main difference is "Is it guaranteed to halt?"
. An algorithm must terminate; a procedure need not
Data Structure
. A data structure (DS) is a way to store and organize data in order to facilitate access and modifications
. No single DS works well for all purposes
. It is important to know the strengths and limitations of each
Algorithm vs. Data Structure
. Two sides of the same coin
. Program = Algorithm + Data Structure
Problem & Algorithm
. Instance of a problem
. consists of the inputs needed to
compute a solution to the problem
. Correctness of an algorithm
. An algorithm is correct if, for every input instance, it halts with the correct output
. A correct algorithm solves the given computational problem
What kind of problem can be
solved by algorithm?

. The Human Genome Project


. The Internet Applications
. Electronic Commerce with
Public-key cryptography and
digital signatures
. Manufacturing and other
commercial settings
The great open problem of computation: P = NP?

Analyzing Algorithm
Pseudocode
. Typically, the algorithm is written
in a pseudocode
. Similar in many respects to C,
PASCAL, or Java…
. The point is to use whichever expressive method is clearest and most concise for specifying a given algorithm
Insertion sort
. Example: Sorting problem
. Input: A sequence of n numbers $\langle a_1, a_2, \ldots, a_n \rangle$
. Output: A permutation $\langle a'_1, a'_2, \ldots, a'_n \rangle$ of the input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$
. ★ The numbers that we wish to sort are known as the keys.
An Example: Insertion Sort
Array A[1..6] at the start of each iteration, with key = A[j]:
index:            1 2 3 4 5 6
j = 2, key = 2:   5 2 4 6 1 3
j = 3, key = 4:   2 5 4 6 1 3
j = 4, key = 6:   2 4 5 6 1 3
j = 5, key = 1:   2 4 5 6 1 3
j = 6, key = 3:   1 2 4 5 6 3
sorted:           1 2 3 4 5 6
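The trace above can be reproduced with a short Python sketch. This is an illustrative 0-based version of the slides' 1-based A[1..n]; the `trace` parameter is a hypothetical addition that records the array at the start of each outer iteration.

```python
def insertion_sort(a, trace=None):
    """Sort the list a in place and return it.

    If trace is a list, a snapshot of a is appended at the start of
    every outer-loop iteration (hypothetical helper for inspection).
    """
    for j in range(1, len(a)):
        if trace is not None:
            trace.append(list(a))
        key = a[j]
        i = j - 1
        # Shift every element of the sorted prefix that is larger than key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a


states = []
result = insertion_sort([5, 2, 4, 6, 1, 3], trace=states)
```

After the call, `states` holds the five intermediate arrays listed in the trace and `result` is the sorted array.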
Observation
. Sorted in place:
. The numbers are rearranged within the array A, with at most a constant number of them stored outside the array at any time.
. Loop invariant:
. At the start of each iteration of the outer "for" loop, the subarray A[1..j-1] consists of the elements originally in A[1..j-1], but in sorted order.
Analyzing algorithms
. Analyzing an algorithm has come to mean predicting the resources that the algorithm requires
. Resources:
. memory, time, bandwidth, logic gates
. Assumption: one processor, RAM model
Performance(1/2)
. How does the algorithm behave when the problem size n gets very large?
. Running time (Time Complexity)
. Memory/storage requirements (Space
Complexity)
. Only additional storage
requirement is considered.
Performance(2/2)
. Remember that we use the RAM model:
. All memory is equally expensive to access
. No concurrent operations (no parallel execution)
. All reasonable instructions take unit time (the same execution time)
. Constant word size (a fixed unit)
. Unless we are explicitly manipulating bits
Running Time
. On a particular input, the number of primitive steps that are executed
. Except for function calls, most statements require roughly the same amount of time
. It is convenient to define the notion of a step so that it is machine-independent
★ Best case, worst case, and average case
Insertion Sort
What is the precondition for this loop?
Insertion Sort
How many times will this loop execute?
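Insertion Sort
Cost of each statement and the number of times it executes:
cost   times
c1     n
c2     n-1
c4     n-1
c5     Σ_{j=2..n} t_j
c6     Σ_{j=2..n} (t_j - 1)
c7     Σ_{j=2..n} (t_j - 1)
c8     n-1
*** t_j is the number of times the while-loop test is executed for the value j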
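To make t_j concrete, here is a small Python sketch that counts the while-loop tests for each j. The function name is hypothetical and the code uses 0-based indexing, so the returned list corresponds to t_2, ..., t_n of the slides.

```python
def while_test_counts(a):
    """Return [t_2, ..., t_n] for insertion sort on a copy of a.

    t_j counts every evaluation of the while-loop test for that j,
    including the final evaluation that fails.
    """
    a = list(a)  # work on a copy so the caller's list is untouched
    counts = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        t = 1  # the last (failing) test always happens once
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            t += 1  # one more successful test
        a[i + 1] = key
        counts.append(t)
    return counts


best = while_test_counts([1, 2, 3, 4, 5, 6])    # already sorted
worst = while_test_counts([6, 5, 4, 3, 2, 1])   # reverse sorted
```

On sorted input every t_j is 1, and on reverse-sorted input t_j equals j, which are the two extreme cases in the cost formula.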
Best-case
★ The body of the while loop is never entered
$$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1)$$
. $t_j = 1$ for $j = 2, 3, \ldots, n$
Best-case
$$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 (n-1) + c_8 (n-1)$$
$$= (c_1 + c_2 + c_4 + c_5 + c_8)\, n - (c_2 + c_4 + c_5 + c_8)$$
. Linear function of n
. $\Theta(n)$
Worst-case
★ The body of the while loop is entered every time
$$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1)$$
. $t_j = j$ for $j = 2, 3, \ldots, n$
Worst-case
$$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \left( \frac{n(n+1)}{2} - 1 \right) + c_6 \left( \frac{n(n-1)}{2} \right) + c_7 \left( \frac{n(n-1)}{2} \right) + c_8 (n-1)$$
$$= \left( \frac{c_5 + c_6 + c_7}{2} \right) n^2 + \left( c_1 + c_2 + c_4 + \frac{c_5 - c_6 - c_7}{2} + c_8 \right) n - (c_2 + c_4 + c_5 + c_8)$$
. Quadratic function of n
. $\Theta(n^2)$
Worst-case vs. Average-case Analysis
. Usually, we concentrate on finding only the worst-case running time
. Reason:
. It is an upper bound on the running time
. The worst case occurs fairly often
. The average case is often as bad as the worst case.
. For example, insertion sort: again a quadratic function.
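Average Case?
Apply the Inversion Table
Array:           5 2 4 6 1 3
Inversion table: 0 1 1 0 4 3
(each entry counts the elements to the left of that position that are larger than the element there)
[Figure: insertion sort states 5 2 4 6 1 3 → 2 5 4 6 1 3 → 2 4 5 6 1 3 → 2 4 5 6 1 3 → 1 2 4 5 6 3, each drawn over the index row 1 to 6]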
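Average-case
Array:           5 2 4 6 1 3
Inversion table: 0 1 1 0 4 3
In a random permutation, the element with value n-k has k larger elements, and 0, 1, ..., k of them precede it with equal probability. Expected count for each value:
n:     0
n-1:   $\frac{0+1}{2} = \frac{1}{2}$
n-2:   $\frac{0+1+2}{3} = \frac{3}{3} = 1 = \frac{2}{2}$
n-3:   $\frac{0+1+2+3}{4} = \frac{6}{4} = \frac{3}{2}$
n-4:   $\frac{0+1+2+3+4}{5} = \frac{10}{5} = 2 = \frac{4}{2}$
...
1:     $\frac{0+1+2+3+\cdots+(n-1)}{n} = \frac{n-1}{2}$
Total:
$$0 + \frac{1}{2} + \frac{2}{2} + \frac{3}{2} + \cdots + \frac{n-1}{2} = \frac{n(n-1)}{4} = \frac{n^2 - n}{4}$$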
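The inversion table and the n(n-1)/4 average can be checked by brute force for small n. The sketch below is illustrative only; both function names are hypothetical, and the enumeration over all permutations is feasible only for small n.

```python
from fractions import Fraction
from itertools import permutations


def inversion_table(a):
    """For each position, count the earlier elements that are larger."""
    return [sum(1 for x in a[:j] if x > a[j]) for j in range(len(a))]


def average_inversions(n):
    """Exact average number of inversions over all permutations of 1..n."""
    perms = list(permutations(range(1, n + 1)))
    total = sum(sum(inversion_table(p)) for p in perms)
    return Fraction(total, len(perms))


table = inversion_table([5, 2, 4, 6, 1, 3])
avg5 = average_inversions(5)
```

The sum of the inversion table equals the number of element shifts insertion sort performs, so the average over all inputs is also the average amount of shifting work.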
Order of Growth
. In some particular cases, we shall be interested in the average-case, or expected, running time of an algorithm
. However, it is the rate of growth, or order of growth, of the running time that really interests us
Analysis
. Simplifications
. Ignore actual and abstract
statement costs
. Order of growth is the
interesting measure:
. Highest-order term is what counts
. Remember, we are doing
asymptotic analysis
. As the input size grows larger it
is the high order term that
dominates
Worst-case
$$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1)$$
. $t_j = j$ for $j = 2, 3, \ldots, n$
$$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \left( \frac{n(n+1)}{2} - 1 \right) + c_6 \left( \frac{n(n-1)}{2} \right) + c_7 \left( \frac{n(n-1)}{2} \right) + c_8 (n-1)$$
$$= \left( \frac{c_5 + c_6 + c_7}{2} \right) n^2 + \left( c_1 + c_2 + c_4 + \frac{c_5 - c_6 - c_7}{2} + c_8 \right) n - (c_2 + c_4 + c_5 + c_8)$$
. Quadratic function: the $n^2$ term dominates
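Growth of Function
. The coefficients don't matter as much as the rate of growth
. A problem P
1. Algorithm 1 solves P in n days
2. Algorithm 2 solves P in 2^n seconds
. Which one is faster?
"Algorithm 1 runs faster for n > 20"
→ n = 20: 2^20 = 1,048,576 secs < 20 x 24 x 60 x 60 = 1,728,000 secs
→ n = 21: 2^21 = 2,097,152 secs > 21 x 24 x 60 x 60 = 1,814,400 secs
So: "being chubby as a child is not really being fat" (small inputs do not show which algorithm wins in the long run)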
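The crossover point can be found by direct computation. This is a minimal sketch that assumes a 24-hour day; the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # assumption: a full 24-hour day


def algorithm1_seconds(n):
    """Algorithm 1 needs n days."""
    return n * SECONDS_PER_DAY


def algorithm2_seconds(n):
    """Algorithm 2 needs 2^n seconds."""
    return 2 ** n


def first_n_where_algorithm1_wins(limit=100):
    """Smallest n in 1..limit where Algorithm 1 is strictly faster."""
    for n in range(1, limit + 1):
        if algorithm1_seconds(n) < algorithm2_seconds(n):
            return n
    return None


crossover = first_n_where_algorithm1_wins()
```

Up to n = 20 the exponential algorithm is still ahead despite its growth rate; from the crossover onward the linear one wins for every larger n.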
Asymptotic Notation

O, Ω, Θ, o, ω
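Asymptotic Upper Bound (O)
$$O(g(n)) = \{ f(n) \mid \exists\, c > 0,\ n_0 > 0 \ \text{s.t.}\ 0 \le f(n) \le c\,g(n),\ \forall n \ge n_0 \}$$
[Figure: f(n) and c g(n) plotted against n; f(n) stays at or below c g(n) for all n ≥ n_0, so f(n) = O(g(n))]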
How to read and view the notation?
$O(n^2)$
⇒ big-Oh of $n^2$, order $n^2$
⇒ "Set" of functions
How to read and view the notation?
$f(n) = O(n^2)$
"=" : abuse of notation
"∈" : belongs to
⇒ f(n) ∈ big-Oh of $n^2$
⇒ f(n) doesn't grow faster than $n^2$
⇒ Do not write $O(n^2) = f(n)$: the "=" reads only from left to right
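Asymptotic Lower Bound (Ω)
$$\Omega(g(n)) = \{ f(n) \mid \exists\, c > 0,\ n_0 > 0 \ \text{s.t.}\ 0 \le c\,g(n) \le f(n),\ \forall n \ge n_0 \}$$
[Figure: f(n) and c g(n) plotted against n; f(n) stays at or above c g(n) for all n ≥ n_0, so f(n) = Ω(g(n))]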
Asymptotically Tight Bound (Θ)
$$\Theta(g(n)) = \{ f(n) \mid \exists\, c_1 > 0,\ c_2 > 0,\ n_0 > 0 \ \text{s.t.}\ 0 \le c_1 g(n) \le f(n) \le c_2 g(n),\ \forall n \ge n_0 \}$$
[Figure: f(n) plotted between c_1 g(n) and c_2 g(n) for all n ≥ n_0, so f(n) = Θ(g(n))]
Theorem
. For any two functions f(n) and g(n), $f(n) = \Theta(g(n))$ if and only if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$
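Upper Bound That Is Not Asymptotically Tight (o)
$$o(g(n)) = \{ f(n) \mid \forall c > 0,\ \exists\, n_0 > 0 \ \text{s.t.}\ 0 \le f(n) < c\,g(n),\ \forall n > n_0 \}$$
[Figure: f(n) plotted strictly below c g(n) for all n beyond n_0, so f(n) = o(g(n))]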
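Lower Bound That Is Not Asymptotically Tight (ω)
$$\omega(g(n)) = \{ f(n) \mid \forall c > 0,\ \exists\, n_0 > 0 \ \text{s.t.}\ 0 \le c\,g(n) < f(n),\ \forall n \ge n_0 \}$$
[Figure: f(n) plotted strictly above c g(n) for all n ≥ n_0, so f(n) = ω(g(n))]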
Examples
$n = O(n)$
$n \le c \times n$, $c = 1$, $n \ge 1$
$100n = O(n)$
$100n \le c \times n$, $c = 100$, $n \ge 1$
$100n^2 = O(n^3 - n^2)$
$100n^2 \le 100 \times (n^3 - n^2)$, let $c = 100$, $n \ge 2$
which is equivalent to $200n^2 \le 100 \times n^3$, true for $n \ge 2$
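Example: $\frac{n^2}{2} - 3n = \Theta(n^2)$
$$c_1 n^2 \le \frac{n^2}{2} - 3n \le c_2 n^2$$
$$\Rightarrow c_1 \le \frac{1}{2} - \frac{3}{n} \le c_2$$
⇒ (1) For $n \ge 7$: $\frac{1}{2} \le c_2$
⇒ (2) $c_1 \le \frac{1}{2} - \frac{3}{7} = \frac{7}{14} - \frac{6}{14} = \frac{1}{14}$
⇒ $\frac{n^2}{14} \le \frac{n^2}{2} - 3n \le \frac{n^2}{2}$, with $n \ge 7$, $c_1 = \frac{1}{14}$, $c_2 = \frac{1}{2}$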
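The chosen constants can be sanity-checked numerically. The sketch below uses exact fractions and tests the two inequalities over a finite range only, so it illustrates the witness rather than proving the bound; the function names are hypothetical.

```python
from fractions import Fraction


def f(n):
    """f(n) = n^2 / 2 - 3n, computed exactly."""
    return Fraction(n * n, 2) - 3 * n


def theta_witness_holds(c1, c2, n0, upto):
    """Check c1*n^2 <= f(n) <= c2*n^2 for every n in n0..upto."""
    return all(c1 * n * n <= f(n) <= c2 * n * n for n in range(n0, upto + 1))


ok_from_7 = theta_witness_holds(Fraction(1, 14), Fraction(1, 2), 7, 1000)
ok_from_6 = theta_witness_holds(Fraction(1, 14), Fraction(1, 2), 6, 1000)
```

The check passes from n = 7 and fails if the range starts at n = 6, where f(6) = 0 lies below 36/14; this is why the example takes n ≥ 7.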
Example: $pn^2 + qn + r = O(n^2)$
$$pn^2 + qn + r \le (|p| + |q| + |r|)\,n^2 + (|p| + |q| + |r|)\,n + (|p| + |q| + |r|)$$
$$\le 3(|p| + |q| + |r|) \times n^2, \quad c = 3(|p| + |q| + |r|),\ n \ge 1$$
⇒ $O(n^2)$
. In general, $p(n) = \sum_{i=0}^{d} a_i n^i$, where the $a_i$ are constants
Then $p(n) = O(n^d)$.
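As a quick illustration of the bound for quadratics, the following sketch tests the constant c = 3(|p| + |q| + |r|) over a grid of small integer coefficients. It is a finite spot check under those assumptions, not a proof, and the function name is hypothetical.

```python
def quadratic_bound_holds(p, q, r, upto=500):
    """Check p*n^2 + q*n + r <= c*n^2 for n = 1..upto, with c = 3(|p|+|q|+|r|)."""
    c = 3 * (abs(p) + abs(q) + abs(r))
    return all(p * n * n + q * n + r <= c * n * n for n in range(1, upto + 1))


# Try every coefficient triple with entries in -3..3.
all_hold = all(
    quadratic_bound_holds(p, q, r)
    for p in range(-3, 4)
    for q in range(-3, 4)
    for r in range(-3, 4)
)
```

The constant is deliberately generous; it only has to exist, not be the smallest possible.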
Order of Growth
Quick Sort
Prof. Shin-Hung Chang
Quicksort
. Sorts in place
. Runs in O(n log n) time in the average case
. Runs in O(n^2) time in the worst case
. So why would people use it instead of merge sort?
Quicksort
Quicksort(A, p, r)
{
if (p < r)
{
j = Partition(A, p, r);
Quicksort(A, p, j);
Quicksort(A, j+1, r);
}
}
Partition
. Clearly, all the action takes
place in the partition() function
. Rearranges the subarray in place
. End result:
. Two subarrays
. All values in first subarray ≤ all
values in second
. Returns the index of the "pivot" element separating the two subarrays
. How do you suppose we
implement this function?
Partition In Words
. Partition(A, p, r):
. Select an element to act as the "pivot" (which?)
. Grow two regions, A[p..j] and A[j+1..r]
. All elements in A[p..j] ≤ pivot
. All elements in A[j+1..r] ≥ pivot
. Increment i until A[i] >= pivot
. Decrement j until A[j] <= pivot
. Swap A[i] and A[j]
. Repeat until i >= j
. Return j
Partition(A, 1, 10)
p = 1, r = 10, pivot x = A[p] = 6
index:               1  2  3  4  5  6  7  8  9 10
start:               6 14 10  8  7  9  3  2  4  1
swap A[1], A[10]:    1 14 10  8  7  9  3  2  4  6
swap A[2], A[9]:     1  4 10  8  7  9  3  2 14  6
swap A[3], A[8]:     1  4  2  8  7  9  3 10 14  6
swap A[4], A[7]:     1  4  2  3  7  9  8 10 14  6
j = 4, i = 5: i ≥ j, so return j = 4
Partition Code
Partition(A, p, r)
{   x = A[p];               // pivot: first element of the subarray
    i = p - 1;
    j = r + 1;
    while (TRUE)
    {   repeat
            j--;
        until A[j] <= x;    // "<=" keeps j from running past the pivot
        repeat
            i++;
        until A[i] >= x;
        if (i < j)
            Swap(A, i, j);
        else
            return j;
    }
}
What is the running time of partition()? Θ(n)
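The pseudocode translates almost line by line into Python. The sketch below is a 0-based version, so the slides' Partition(A, 1, 10) becomes `partition(a, 0, 9)`; it stops the j scan at elements less than or equal to the pivot, as in the standard Hoare scheme, and is an illustration rather than the lecture's own code.

```python
def partition(a, p, r):
    """Hoare-style partition of a[p..r] around the pivot x = a[p].

    Returns j such that every element of a[p..j] is <= every
    element of a[j+1..r].
    """
    x = a[p]
    i = p - 1
    j = r + 1
    while True:
        j -= 1
        while a[j] > x:      # repeat j-- until a[j] <= x
            j -= 1
        i += 1
        while a[i] < x:      # repeat i++ until a[i] >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place (whole list by default) and return a."""
    if r is None:
        r = len(a) - 1
    if p < r:
        j = partition(a, p, r)
        quicksort(a, p, j)
        quicksort(a, j + 1, r)
    return a


example = [6, 14, 10, 8, 7, 9, 3, 2, 4, 1]
split = partition(example, 0, 9)
```

Note that the recursive calls use `(p, j)` and `(j + 1, r)`: with this partition scheme the pivot is not guaranteed to end up at index j, so position j must stay inside the left part.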
Analyzing Quicksort
. In the worst case:
T(1) = Θ(1)
T(n) = T(n - 1) + Θ(n)
. Expanding with T(n) = T(n - 1) + cn:
$$T(n) = T(n-1) + cn = T(n-2) + c(n-1) + cn = \cdots = T(1) + c \sum_{k=2}^{n} k$$
$$= \Theta(1) + c \left( \frac{n(n+1)}{2} - 1 \right) = \Theta(n^2)$$
. Time Complexity
T(n) = Θ(n^2)
Analyzing Quicksort
. In the best case:
T(n) = 2T(n/2) + Θ(n)
. The Master Theorem: a = 2, b = 2
$$n^{\log_b a} = n^{\log_2 2} = n = \Theta(n)$$
$$T(n) = \Theta(n^{\log_b a} \times \log n) = \Theta(n \log n)$$
. Time Complexity
T(n) = Θ(n lg n)
Improving Quicksort
. The real liability of quicksort is that it runs in O(n^2) time on already-sorted input
. Discuss two solutions:
. Randomize input array
. Random pivot element
. How will these solve the problem?
. By ensuring that no particular input can be chosen to make quicksort run in O(n^2) time
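The second solution, a random pivot, can be sketched as follows. This is an illustrative Python version: the helper names are hypothetical, the partition is a Hoare-style scheme with the pivot at the front of the subarray, and Python's standard `random.randint` supplies the pivot position.

```python
import random


def hoare_partition(a, p, r):
    """Partition a[p..r] around x = a[p]; return the split index j."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def randomized_quicksort(a, p=0, r=None):
    """Quicksort that moves a randomly chosen element to the pivot slot first."""
    if r is None:
        r = len(a) - 1
    if p < r:
        k = random.randint(p, r)     # random pivot position in p..r (inclusive)
        a[p], a[k] = a[k], a[p]
        j = hoare_partition(a, p, r)
        randomized_quicksort(a, p, j)
        randomized_quicksort(a, j + 1, r)
    return a


already_sorted = list(range(2000))
result = randomized_quicksort(list(already_sorted))
```

With a fixed first-element pivot, sorted input of this size would drive the recursion to a depth close to n; with a random pivot the split is balanced on average, so no fixed input is reliably bad.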
Binary Search
Prof. Shin-Hung Chang
Binary Search
. Given a value and a sorted array a[], find index i such that a[i] = value, or report that no such index exists.
. Invariant: a[lo] ≤ value ≤ a[hi]
. Ex. Binary search for 33.
index:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
a[]:     6 13 14 25 33 43 51 53 64 72 84 93 95 96 97
. lo = 0, hi = 14
Binary Search
. lo = 0, mid = 7, hi = 14: a[7] = 53 > 33, so continue in the left half
. lo = 0, hi = 6
. lo = 0, mid = 3, hi = 6: a[3] = 25 < 33, so continue in the right half
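The search for 33 can be carried through to the end with a short Python sketch. This is one common iterative formulation, shown for illustration; returning `None` for a missing value is a choice made here, not something the slides specify.

```python
def binary_search(a, value):
    """Return an index i with a[i] == value in the sorted list a, or None."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if value < a[mid]:
            hi = mid - 1      # value can only be in the left half
        elif value > a[mid]:
            lo = mid + 1      # value can only be in the right half
        else:
            return mid
    return None


data = [6, 13, 14, 25, 33, 43, 51, 53, 64, 72, 84, 93, 95, 96, 97]
found = binary_search(data, 33)
missing = binary_search(data, 34)
```

Each iteration halves the range lo..hi, which is where the O(log n) search time comes from.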
Time Complexity
. (1) Sorting: O(n log n)
. (2) Search: O(log n)