Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
0% found this document useful (0 votes)
33 views

Data Structure Algorithm CH 1

The document discusses data structures and algorithms. It begins by defining a data structure as the way of organizing and storing data to facilitate access and modification. It then categorizes data structures as primitive and non-primitive. Primitive data structures are basic structures like integers while non-primitive are more complex structures built from primitive ones, such as linked lists, trees, and graphs. The document also discusses algorithms, defining them as step-by-step instructions to solve problems and analyzing their properties and performance in terms of time and space complexity.

Uploaded by

tamirat zemenu
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
33 views

Data Structure Algorithm CH 1

The document discusses data structures and algorithms. It begins by defining a data structure as the way of organizing and storing data to facilitate access and modification. It then categorizes data structures as primitive and non-primitive. Primitive data structures are basic structures like integers while non-primitive are more complex structures built from primitive ones, such as linked lists, trees, and graphs. The document also discusses algorithms, defining them as step-by-step instructions to solve problems and analyzing their properties and performance in terms of time and space complexity.

Uploaded by

tamirat zemenu
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 71

(ITec2083)

 Improve ability to solve problems abstractly.


 Improve ability to analyze algorithm.
2010E.C
 To exactly know, what is data structure? We
must know:
What is a computer program?

Some mysterious
processing
Input Output
-Computer program development process required
us to represent data in effective way.
4
 Data structure is structural representation of the logical
relationship existing between individual elements of data.

 In other words, a data structure is a way of organizing all


data items that considers not only the elements stored but
also their relationship to each other.

 It is a representation of data and the operations allowed on


that data.
It is a way to store and organize data in order to facilitate the
access and modifications.
Data Structure is the method of representing of logical
relationships between individual data elements related to
the solution of a given problem.
 Data structure affects the design of both structural &

functional aspects of a program.


Program = algorithm + Data Structure
Data Structure = Operation + Organized Data
 Storage structure is the representation of particular DS in

the memory of the computer.


 You know that a algorithm is a step by step procedure to

solve a particular function.


 That means, algorithm is a set of instruction written to
carry out certain tasks & the data structure is the way of
organizing the data with their logical relationship
retained.

 To develop a program of an algorithm, we should select


an appropriate data structure for that algorithm.

 To convert a formal algorithm to a program we use step


wise Refinement Techniques (mathematical model,
pseudo code, DS, C/C++ program etc.)
 Data structure are normally divided into two broad
categories:
Primitive Data Structure
Non-Primitive Data Structure
Data structure

Primitive DS Non-Primitive DS

Integer Float Character Pointer


Non-Primitive DS

Linear List Non-Linear List

Array Queue Graph Trees

Link List Stack


array

Linked list

queue
tree stack
 PDSs are basic structures and directly operated upon
by the machine instructions.
 In general, there are different representation on
different computers.
 Integer, Floating-point number, Character constants,
string constants, pointers etc, fall in this category.
 NPDSs are more sophisticated data structures.
 These are derived from the primitive data structures.
 The non-primitive data structures emphasize on

structuring of a group of homogeneous (same type) or


heterogeneous (different type) data items.
 The design of an efficient data structure must take

operations to be performed on the data structure.


 The most commonly used operation on data
structure are broadly categorized into following
types:
Create
Selection
Updating
Searching
Sorting
Merging
Destroy or Delete
 Linear: In Linear data structure, values are arrange in
linear fashion.
Array: Fixed-size
Linked-list: Variable-size
Stack: Add to top and remove from top
Queue: Add to back and remove from front
Priority queue: Add anywhere, remove the highest
priority
 Non-Linear: The data values in this structure are not
arranged in order.
Hash tables: Unordered lists which use a ‘hash
function’ to insert and search
Tree: Data is organized in branches.
Graph: A more general branching structure, with less
strict connection conditions than for a tree
 A primitive data structure is generally a basic
structure that is usually built into the language, such
as an integer, a float.
 A non-primitive data structure is built out of

primitive data structures linked together in


meaningful ways, such as a or a linked-list, binary
search tree, AVL Tree, graph etc.
 The choice of particular data model depends on two
consideration:
It must be rich enough in structure to represent the
relationship between data elements

The structure should be simple enough that one can


effectively process the data when necessary
 Definition:- ADT
In Object Oriented Programming data and the operations that manipulate
that data are grouped together in classes
Abstract Data Types (ADTs) stores data and allow various operations on
the data to access and change it.
A mathematical model, together with various operations defined on the
model
An ADT is a collection of data and associated operations for manipulating
that data
 Data Structures
Physical implementation of an ADT
data structures used in implementations are provided in a language
(primitive or built-in) or are built from the language constructs (user-
defined)
Each operation associated with the ADT is implemented by one or more
subroutines in the implementation
 ADTs support abstraction, encapsulation, and
information hiding.

 Abstraction is the structuring of a problem into well-


defined entities by defining their data and operations.

 The principle of hiding the used data structure and to


only provide a well-defined interface is known as
encapsulation.
 Every Collection ADT should provide a way to:
add an item
remove an item
find an item
retrieve an item or
access an item
 when implementing an ADT the operations and
behaviors are already specified
 Implementer’s first choice is what to use as the internal

storage container for the concrete data type


the internal storage container is used to hold the items in
the collection
often an implementation of an ADT

22
No single data structure works well for all
purposes, and so it is important to know the
strengths and limitations of several of them
 Collection with access only to the last element inserted
 Last in first out
 insert/push
 remove/pop Data4 Top

 top Data3
 make empty
Data2

Data1
 Collection with access only to the item that has been
present the longest
 Last in last out or first in first out
 enqueue, dequeue, front
 priority queues and dequeue

Front Back

Data1 Data2 Data3 Data4


 Items have a position in this Collection
Random access or not?
 Array Lists
internal storage container is native array

26
 A Flexible structure, because can grow and shrink on
demand.
Elements can be:
Inserted
Accessed
Deleted
At any position
last
first
 A Tree is a collection of elements called nodes.
 One of the node is distinguished as a root, along with

a relation (“parenthood”) that places a hierarchical


structure on the nodes. Root
Prepared
CENG 213by Alemu
Data Ginjo
Structures 29

ALGORITHM AND
ALGORITHM ANALYSIS
CENG 213 Data Structures 30

Algorithm
An algorithm is a clearly specified set of instructions to
be followed to solve a problem.
There can be more than one solution (more than one
algorithm) to solve a given problem.
It can be implemented using different programming
languages on different platforms.
An algorithm must be correct. It should correctly solve
the problem.
e.g. For sorting, this means even if (1) the input is
already sorted, or (2) it contains repeated elements.
Once we have a correct algorithm for a problem, we
have to determine the efficiency of that algorithm.
Prepared by Alemu Ginjo
CENG 213 Data Structures 31

Problems
Problem: is a task to be performed.
• Best thought of as inputs and matching outputs.
• Problem definition should include constraints on the
resources that may be consumed by any acceptable
solution.

Prepared by Alemu Ginjo


CENG 213 Data Structures

Algorithm Properties
An algorithm possesses the following properties:
It must be correct.
It must be composed of a series of concrete steps.
There can be no ambiguity as to which step will be
performed next.
It must be composed of a finite number of steps.
It must terminate.
A computer program is an instance, or concrete
representation, for an algorithm in some programming
language.
32 Prepared by Alemu Ginjo
Prepared
CENG 213by Alemu
Data Ginjo
Structures 33

Algorithmic Performance
There are two aspects of algorithmic performance:
Time
• Instructions take time.
• How fast does the algorithm perform?
• What affects its runtime? 
=> computer, compiler, algorithm used & input to the algorithm
Space
• Data structures take space
• What kind of data structures can be used?
• How does choice of data structure affect the runtime?

We will focus on time:


How to estimate the time required for an algorithm
How to reduce the time required
Prepared
CENG 213by Alemu
Data Ginjo
Structures 34

Analysis of Algorithms
Algorithm analysis refers to the process of determining
the amount of computing time and storage space required
by different algorithms.
In other words, it’s a process of predicting the resource
requirement of algorithms in a given environment.
In order to solve a problem, there are many possible
algorithms.
One has to be able to choose the best algorithm for the
problem at hand using some scientific method
CENG 213by
Prepared Data Structures
Alemu Ginjo 35

Analysis of Algorithms(cont.)
- To classify some data structures and algorithms as
good, we need precise ways of analyzing them in terms
of resource requirement.
- The main resources are:
Running Time
Memory Usage
Communication Bandwidth
Prepared
CENG 213by Alemu
Data Ginjo
Structures 36

Analysis of Algorithms(cont.)
Running time is usually treated as the most important
since computational time is the most precious resource
in most problem domains.
There are two approaches to measure the efficiency of
algorithms:
• Empirical: Programming competing algorithms
and trying them on different instances.
• Theoretical: Determining the quantity of resources
required mathematically (Execution time, memory
space, etc.) needed by each algorithm.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 37

Analysis of Algorithms(cont.)
However, it is difficult to use actual clock-time as a
consistent measure of an algorithm’s efficiency, because
clock-time can vary based on many things. For example,
• Specific processor speed
• Current processor load
• Specific data for a particular run of the program
Input size
Input properties
• Operating Environment
Prepared
CENG 213by Alemu
Data Ginjo
Structures 38

Analysis of Algorithms(cont.)
Accordingly, we can analyze an algorithm according to
the number of operations required, rather than
according to an absolute amount of time involved.
This can show how an algorithm’s efficiently changes
according to the size of the input.
CENG 213by
Prepared Data Structures
Alemu Ginjo 39

Complexity Analysis
• Complexity Analysis is the systematic study of the cost of
computation, measured either in time units or in operations
performed, or in the amount of storage space required.
• The goal is to have a meaningful measure that permits
comparison of algorithms independent of operating
platform.
I.Time Complexity: Determine the approximate number of
operations required to solve a problem of size n.
II.Space Complexity: Determine the approximate memory
required to solve a problem of size n.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 40

Complexity Analysis
Complexity analysis involves two distinct phases:
1.Algorithm Analysis: Analysis of the algorithm or data
structure to produce a function T (n) that describes the
algorithm in terms of the operations performed in order to
measure the complexity of the algorithm.
2.Order of Magnitude Analysis: Analysis of the function T
(n) to determine the general complexity category to which it
belongs.
There is no generally accepted set of rules for algorithm
analysis.
However, an exact count of operations is commonly used.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 41

Analysis Rules
Analysis Rules:
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes time 1:
• Assignment operation, Single Input/output operation.
• Single Boolean Operations. ex: AND, OR, NOR, NAND,
XOR, etc.
• Single Arithmetic Operations. ex: +, *, /, etc. and Function
Return.
3. Running time of a selection statement (if, switch) is the
time for the condition evaluation + the maximum of the
running times for the individual clauses in the selection.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 42

Analysis Rules(cont.)
4. Loops: Running time for a loop is equal to the running
time for the statements inside the loop * number of iteration.
The total running time of a statement inside a group of
nested loops is the running time of the statements
multiplied by the product of the sizes of all the loops.
For nested loops, analyze inside out.
Always assume that the loop executes the maximum
number of iterations possible.
5. Running time of a function call is 1 for setup + the time
for any parameter calculations + the time required for the
execution of the function body.
CENG 213by
Prepared Data Structures
Alemu Ginjo 43

Analysis Rules(cont.)
Generally:-
1 for the assignment statement.
1 for the output statement.
1 for the input statement.
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
CENG 213by
Prepared Data Structures
Alemu Ginjo 44

Examples: 1

int count( ){
int k=0; 1
cout<<”Enter an integer”; 1
cin>>n; 1
for ( i=0; i<n; i++) 1+n+1+n
k=k+1; 2n
return 0; 1
}
T(n) = 1+1+1+(1+n+1+n)+2n+1 = 4n+6 = O(n)
CENG 213by
Prepared Data Structures
Alemu Ginjo 45

2.
int total(int n )
{
int sum=0;
for (int i=1; i<=n; i++)
sum=sum+1;
return sum;
}

T(n) = 1+(1+n+1+n)+2n+1 = 4n+4 = O(n)


CENG 213by
Prepared Data Structures
Alemu Ginjo 46

3. void func() Time Units to Compute


{ -------------------------------------------------
int x=0; 1 for the first assignment statement: x=0;
int i=0; 1 for the second assignment statement: i=0;
int j=1; 1 for the third assignment statement: j=1;
cout<< “Enter an Integer value”; 1 for the output statement.
cin>>n;
while (i<n){
1 for the input statement.
x++; In the first while loop:
i++; n+1 tests
} n loops of 2 units for the two increment (addition) operations
while (j<n) In the second while loop:
{ n tests
j++; n-1 increments
} -------------------------------------------------------------------
}
T (n)= 1+1+1+1+1+n+1+2n+n+n-1 = 5n+5 = O(n)
CENG 213by
Prepared Data Structures
Alemu Ginjo 47

4. int sum (int n)


{
int partial_sum = 0;
for (int i = 1; i <= n; i++)
partial_sum = partial_sum +(i * i * i);
return partial_sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment.
1 assignment, n+1 tests, and n increments.
n loops of 4 units for an assignment, an addition, and two multiplications.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+(1+n+1+n)+4n+1 = 6n+4 = O(n)
Prepared
CENG 213by Alemu
Data Ginjo
Structures 48

Analysis of Algorithms(cont.)
• Analysis of Algorithms is the area of computer science
that provides tools to analyze the efficiency of different
methods of solutions.
• It predicts resources that an algorithm requires.
• How do we compare the time efficiency of two
algorithms that solve the same problem?
Naïve Approach: implement these algorithms in a
programming language (C++), and run them to compare
their time requirements.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 49

Analysis of Algorithms(cont.)

Comparing the programs(instead of algorithms) has difficulties.


• How are the algorithms coded?
Comparing running times means comparing the
implementations.
We should not compare implementations, because they are
sensitive to programming style that may cloud the issue of
which algorithm is inherently more efficient.
• What computer should we use?
We should compare the efficiency of the algorithms
independently of a particular computer.
• What data should the program use?
Any analysis must be independent of specific data.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 50

Analysis of Algorithms(cont.)
• When we analyze algorithms, we should employ
mathematical techniques that analyze algorithms
independently of specific implementations, computers,
or data.
• To analyze algorithms:
First, we start to count the number of significant
operations in a particular solution to assess its
efficiency.
Then, we will express the efficiency of algorithms
using growth functions.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 51

The Execution Time of Algorithms


Each operation in an algorithm (or a program) has a cost.
Each operation takes a certain of time.

count = count + 1;  take a certain amount of time, but it is constant

A sequence of operations:

count = count + 1; Cost: c1


sum = sum + count; Cost: c2

 Total Cost = c1 + c2
Prepared
CENG 213by Alemu
Data Ginjo
Structures 52

The Execution Time of


Algorithms(cont.)
Example: Simple If-Statement
Cost Times
if (n < 0) c1 1
absval = -n c2 1
else
absval = n; c3 1

Total Cost <= c1 + max(c2,c3)


Prepared
CENG 213by Alemu
Data Ginjo
Structures 53

The Execution Time of


Algorithms(cont.)
Example: Simple Loop
Cost Times
i = 1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
i = i + 1; c4 n
sum = sum + i; c5 n
}

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*c5


 The time required for this algorithm is proportional to n
Prepared
CENG 213by Alemu
Data Ginjo
Structures 54

The Execution Time of


Algorithms(cont.)
Example: Nested Loop
Cost Times
i=1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
j=1; c4 n
while (j <= n) { c5 n*(n+1)
sum = sum + i; c6 n*n
j = j + 1; c7 n*n
}
i = i +1; c8 n
}

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5+n*n*c6+n*n*c7+n*c8


 The time required for this algorithm is proportional to n2
Prepared
CENG 213by Alemu
Data Ginjo
Structures 55

General Rules for Estimation


• Loops: The running time of a loop is at most the
running time of the statements inside of that loop times
the number of iterations.
• Nested Loops: Running time of a nested loop
containing a statement in the inner most loop is the
running time of statement multiplied by the product of
the sized of all loops.
• Consecutive Statements: Just add the running times of
those consecutive statements.
• If/Else: Never more than the running time of the test
plus the larger of running times of S1 and S2.
CENG 213by
Prepared Data Structures
Alemu Ginjo 56

Formal Approach to Analysis


In the above examples we have seen that analysis is a bit
complex.
However, it can be simplified by using some formal approach in
which case we can ignore initializations, loop control, and book
keeping.
1. For Loops: Formally
• In general, a for loop translates to a summation. The index and
bounds of the summation are the same as the index and bounds
of the for loop.
N


f
or(
i
nti
= 1
;i<=N;
i++
){
s
um= s
um+
i; 
1N
}

i1
Suppose we count the number of additions that are done. There
is 1 addition per iteration of the loop, hence N additions in total.
CENG 213by
Prepared Data Structures
Alemu Ginjo 57

2. Nested Loops: Formally


Nested for loops translate into multiple summations, one for each
for loop.

Again, count the number of additions. The outer summation is for


the outer for loop.
CENG 213by
Prepared Data Structures
Alemu Ginjo 58

3. Consecutive Statements: Formally


Add the running times of the separate blocks of your code
CENG 213by
Prepared Data Structures
Alemu Ginjo 59

4. Conditionals: Formally
If (test) s1 else s2: Compute the maximum of the running time
for s1 and s2.
CENG 213by
Prepared Data Structures
Alemu Ginjo 60

Example: Suppose we have hardware capable of executing 106


instructions per second. How long would it take to execute an
algorithm whose complexity function was:
T (n) = 2n2 on an input size of n=108?
The total number of operations to be performed would be T (108):

T(108) = 2*(108)2 =2*1016


The required number of seconds
required would be given by
T(108)/106 so:

Running time =2*1016/106 = 2*1010


•The number of seconds per day is 86,400 so this is about
231,480 days (634 years).
Prepared
CENG 213by Alemu
Data Ginjo
Structures 61

What to Analyze
An algorithm can require different times to solve
different problems of the same size.
E.g. Searching an item in a list of n elements using
sequential search.  Cost: 1,2,...,n
Worst-Case Analysis –The maximum amount of time
that an algorithm require to solve a problem of size n.
This gives an upper bound for the time complexity of
an algorithm.
Normally, we try to find worst-case behavior of an
algorithm.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 62

What to Analyze (cont.)


Best-Case Analysis –The minimum amount of time that an
algorithm require to solve a problem of size n.
The best case behavior of an algorithm is NOT so useful.
Average-Case Analysis –The average amount of time that an
algorithm require to solve a problem of size n.
Sometimes, it is difficult to find the average-case behavior
of an algorithm.
We have to look at all possible data organizations of a
given size n, and their distribution probabilities of
these organizations.
Worst-case analysis is more common than average-case
analysis.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 63

Measures of Times
• In order to determine the running time of an algorithm it is
possible to define three functions Tbest(n), Tavg(n) and
Tworst(n), the average and the worst case running time of the
algorithm respectively.
1.Average Case (Tavg): The amount of time the algorithm
takes on an “average” set of inputs.
Worst Case (Tworst): The amount of time the algorithm takes
on the worst possible set of inputs.
That is the longest running time for any input of size n.
2.Best Case (Tbest): The amount of time the algorithm takes
on the smallest possible set of inputs.
Prepared
CENG 213by Alemu
Data Ginjo
Structures 64

Asymptotic Analysis
Asymptotic analysis is concerned with how the running
time of an algorithm increases with the size of the input in
the limit, as the size of the input increases without bound.
There are five notations used to describe a running time
function. These are:
Big-Oh Notation (O)
Big-Omega Notation ()
Theta Notation ()
Little-o Notation (o)
Little-Omega Notation ()
CENG 213by
Prepared Data Structures
Alemu Ginjo 65

The Big-Oh Notation


Big-Oh notation is a way of comparing algorithms and is used for
computing the complexity of algorithms; i.e., the amount of time
that it takes for computer program to run.
It’s only concerned with what happens for very a large value of n.
Therefore only the largest term in the expression (function) is
needed.

For example, if the number of operations in an algorithm is n2 – n,


n is insignificant compared to n2 for large values of n.
Hence the n term is ignored. Of course, for small values of n, it may
be important.
However, Big-Oh is mainly concerned with large values of n.
CENG 213by
Prepared Data Structures
Alemu Ginjo 66

Formal Definition: f (n)= O (g (n)) if there exist c, k ∊ ℛ+


such that for all n ≥ k, f (n) ≤ c.g (n).
Examples: The following points are facts that you can use
for Big-Oh problems:
1<=n for all n>=1
n<=n2 for all n>=1
2n <=n! for all n>=4
log2n<=n for all n>=2
n<=nlog2n for all n>=2
CENG 213by
Prepared Data Structures
Alemu Ginjo 67

1. f(n)=10n+5 and g(n)=n. Show that f(n) is O(g(n)).


To show that f(n) is O(g(n)) we must show that
constants c and k such that
f(n) <=c.g(n) for all n>=k
Or 10n+5<=c.n for all n>=k
Try c=15. Then we need to show that 10n+5<=15n
Solving for n we get: 5<5n or 1<=n.
So f(n) =10n+5 <=15.g(n) for all n>=1.
(c=15,k=1).
CENG 213by
Prepared Data Structures
Alemu Ginjo 68

2. f(n) = 3n2 +4n+1. Show that f(n)=O(n2).


4n <=4n2 for all n>=1 and 1<=n2 for all n>=1
3n2 +4n+1<=3n2+4n2+n2 for all n>=1
<=8n2 for all n>=1
So we have shown that f(n)<=8n2 for all n>=1
Therefore, f (n) is O(n2) (c=8,k=1)
CENG 213by
Prepared Data Structures
Alemu Ginjo 69

Typical Orders

Here is a table of some typical cases. This uses logarithms to base 2, but these are simply
proportional to logarithms in other base.

N O(1) O(log n) O(n) O(n log n) O(n2) O(n3)


1 1 1 1 1 1 1
2 1 1 2 2 4 8
4 1 2 4 8 16 64
8 1 3 8 24 64 512
16 1 4 16 64 256 4,096
1024 1 10 1,024 10,240 1,048,576 1,073,741,824
CENG 213by
Prepared Data Structures
Alemu Ginjo 70

Demonstrating that a function f(n) is big-O of a function g(n)


requires that we find specific constants c and k for which the
inequality holds (and show that the inequality does in fact hold).
Big-O expresses an upper bound on the growth rate of a function,
for sufficiently large values of n.
An upper bound is the best algorithmic solution that has been
found for a problem.
“ What is the best that we know we can do?”
Exercise:
f(n) = (3/2)n2+(5/2)n-3
Show that f(n)= O(n2)
In simple words, f (n) =O(g(n)) means that the growth rate of f(n)
is less than or equal to g(n).
Prepared
CENG 213by Alemu
Data Ginjo
Structures 71

End of chapter 1

You might also like