
DSA Unit-1


Data Structures

Dr. S. Rakesh
Assistant Professor
IT Department

UNIT-1
Part-1

• Data Structures: Once we have data in
variables, we need some mechanism for
manipulating that data to solve problems. A data
structure is a particular way of storing and
organizing data in a computer so that it can be
used efficiently.

• Depending on the organization of the elements,
data structures are classified into two types:

• Linear data structures: Elements are accessed
in a sequential order, but it is not compulsory
to store all elements sequentially.
Examples: Linked Lists, Stacks and Queues.
• Non-linear data structures: Elements of this
data structure are stored/accessed in a non-linear
order. Examples: Trees and Graphs.

Abstract Data Types (ADTs):
To simplify the process of solving problems, we combine the data structures with their
operations and we call this Abstract Data Types (ADTs). An ADT consists of two parts:
1. Declaration of data
2. Declaration of operations.
Commonly used ADTs include: Linked Lists, Stacks, Queues, Priority Queues, Binary
Trees, Dictionaries, Disjoint Sets (Union and Find), Hash Tables, Graphs, and many
others. While defining an ADT, we do not worry about implementation details; they
come into the picture only when we want to use the ADT.
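For example, a stack ADT can be described by its data and its operations alone. A minimal C sketch (the names and the fixed capacity are illustrative assumptions, not a prescribed implementation):

// Declaration of data
typedef struct {
    int items[100];   // storage (fixed capacity, for simplicity)
    int top;          // index of the topmost element; -1 when empty
} Stack;

// Declaration of operations (bodies are supplied separately;
// the user of the ADT needs only these signatures)
void init(Stack *s);
void push(Stack *s, int x);
int  pop(Stack *s);
int  is_empty(const Stack *s);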

What is an Algorithm?

• An algorithm is a step-by-step set of unambiguous instructions to solve a
given problem.
• In the traditional study of algorithms, there are two main criteria for
judging the merits of algorithms:
• correctness
• efficiency

Characteristics of an Algorithm
• Input: zero or more quantities are supplied externally.
• Output: at least one quantity is produced.
• Definiteness: each instruction is clear and unambiguous.
• Finiteness: the algorithm terminates after a finite number of steps.
• Effectiveness: every instruction is basic enough to be carried out.
• Why the Analysis of Algorithms?
• Goal of the Analysis of Algorithms
The goal of the analysis of algorithms is to compare algorithms mainly in terms of
running time but also in terms of other factors (e.g., memory, developer effort, etc.)
• What is Running Time Analysis?
It is the process of determining how processing time increases as the size of the
problem (input size) increases. Input size is the number of elements in the input, and
depending on the problem type, the input may be of different types.
The following are the common types of inputs.
 Size of an array
 Polynomial degree
 Number of elements in a matrix
 Number of bits in the binary representation of the input
 Vertices and edges in a graph.
• How to Compare Algorithms
To compare algorithms, let us define a few objective measures:
Execution times? Not a good measure as execution times are specific to a particular
computer.

Number of statements executed? Not a good measure, since the number of statements
varies with the programming language as well as the style of the individual programmer.

Ideal solution? Let us assume that we express the running time of a given algorithm as a
function of the input size n (i.e., f(n)) and compare these different functions
corresponding to running times. This kind of comparison is independent of machine
time, programming style, etc.

Types of Analysis: To analyze the given algorithm, we need to know with which
inputs the algorithm takes less time and with which inputs the algorithm takes a long
time. We have already seen that an algorithm can be represented in the form of an
expression. That means we represent the algorithm with multiple expressions: one
for the case where it takes less time and another for the case where it takes more
time. In general, the first case is called the best case and the second case is called
the worst case for the algorithm.
To analyze an algorithm we need some kind of syntax, and that forms the base for
asymptotic analysis/notation.

There are three types of analysis:
• Worst case:
• Defines the input for which the algorithm takes a long time (slowest time to
complete).
• Input is the one for which the algorithm runs the slowest.
• Best case:
• Defines the input for which the algorithm takes the least time (fastest time to
complete).
• Input is the one for which the algorithm runs the fastest.
• Average case:
• Provides a prediction about the running time of the algorithm.
• Run the algorithm many times, using many different inputs that come from
some distribution that generates these inputs, compute the total running time
(by adding the individual times), and divide by the number of trials.
• Assumes that the input is random.
Lower Bound <= Average Time <= Upper Bound
• What is Rate of Growth?
The rate at which the running time increases as a function of input is called rate of
growth.
• Commonly Used Rates of Growth
 O(1) – constant
 O(log n) – logarithmic
 O(n) – linear
 O(n log n) – linear-logarithmic
 O(n²) – quadratic
 O(n³) – cubic
 O(2ⁿ) – exponential
 O(n!) – factorial
Asymptotic Notation
Having the expressions for the best, average and worst cases, we need to identify
the upper and lower bounds for all three cases. To represent these upper and lower
bounds, we need some kind of syntax, and that is the subject of the following
discussion. Let us assume that the given algorithm is represented in the form of a
function f(n).

Big-O Notation [Upper Bounding Function]

This notation gives the tight upper bound of the given function. Formally,
f(n) = O(g(n)) if there exist positive constants c and n₀ such that
0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀. That means, at larger values of n, the upper
bound of f(n) is g(n). For example, if f(n) = n⁴ + 100n² + 10n + 50 is the given
algorithm, then g(n) is n⁴. That means g(n) gives the maximum rate of growth for
f(n) at larger values of n.

Big-O Visualization: O(g(n)) is the set of functions with smaller or the same order
of growth as g(n). For example, O(n²) includes O(1), O(n), O(n log n), etc.
Big-O Examples
Example-1: Find an upper bound for f(n) = 3n + 8.
Solution: 3n + 8 ≤ 4n, for all n ≥ 8
∴ 3n + 8 = O(n) with c = 4 and n₀ = 8

Example-2: Find an upper bound for f(n) = n² + 1.
Solution: n² + 1 ≤ 2n², for all n ≥ 1
∴ n² + 1 = O(n²) with c = 2 and n₀ = 1

Example-3: Find an upper bound for f(n) = n⁴ + 100n² + 50.
Solution: n⁴ + 100n² + 50 ≤ 2n⁴, for all n ≥ 11
∴ n⁴ + 100n² + 50 = O(n⁴) with c = 2 and n₀ = 11

Example-4: Find an upper bound for f(n) = 2n³ – 2n².
Solution: 2n³ – 2n² ≤ 2n³, for all n ≥ 1
∴ 2n³ – 2n² = O(n³) with c = 2 and n₀ = 1
Omega-Ω Notation [Lower Bounding Function]
Similar to the O discussion, this notation gives the tight lower bound of the given
algorithm, and we represent it as f(n) = Ω(g(n)). Formally, f(n) = Ω(g(n)) if there
exist positive constants c and n₀ such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀. That
means, at larger values of n, the tight lower bound of f(n) is g(n). For example, if
f(n) = 100n² + 10n + 50, then g(n) is n² and f(n) = Ω(n²).

Ω Examples
Example-1: Find a lower bound for f(n) = 5n².
Solution: We need c, n₀ such that 0 ≤ cn² ≤ 5n²; c = 5 and n₀ = 1 work.
∴ 5n² = Ω(n²) with c = 5 and n₀ = 1

Example-2: Prove f(n) = 100n + 5 ≠ Ω(n²).
Solution: Suppose there exist c, n₀ such that 0 ≤ cn² ≤ 100n + 5 for all n ≥ n₀.
100n + 5 ≤ 100n + 5n (∀ n ≥ 1) = 105n
cn² ≤ 105n ⇒ n(cn – 105) ≤ 0
Since n is positive, cn – 105 ≤ 0 ⇒ n ≤ 105/c
⇒ Contradiction: n cannot be bounded above by a constant.
Theta-Θ Notation [Order Function]
This notation decides whether the upper and lower bounds of a given function
(algorithm) are the same. Formally, f(n) = Θ(g(n)) if there exist positive constants
c₁, c₂ and n₀ such that c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀. The average running
time of an algorithm is always between the lower bound and the upper bound. If the
upper bound (O) and lower bound (Ω) give the same result, then the Θ notation will
also have the same rate of growth.

Θ Examples

Example 1: Find the Θ bound for f(n) = n²/2 – n/2.

Solution: n²/5 ≤ n²/2 – n/2 ≤ n², for all n ≥ 2

∴ n²/2 – n/2 = Θ(n²) with c₁ = 1/5, c₂ = 1 and n₀ = 2
Example 2: Prove n ≠ Θ(n²).
Solution: c₁n² ≤ n ≤ c₂n² only holds for n ≤ 1/c₁.
∴ n ≠ Θ(n²)

Example 3: Prove 6n³ ≠ Θ(n²).
Solution: c₁n² ≤ 6n³ ≤ c₂n² only holds for n ≤ c₂/6.
∴ 6n³ ≠ Θ(n²)

Example 4: Prove n ≠ Θ(log n).
Solution: c₁ log n ≤ n ≤ c₂ log n ⇒ c₂ ≥ n/log n, ∀ n ≥ n₀ – impossible,
since n/log n grows without bound.

• We generally focus on the upper bound (O) because knowing the lower bound (Ω)
of an algorithm is of no practical importance, and we use the Θ notation if the
upper bound (O) and lower bound (Ω) are the same.
Guidelines for Asymptotic Analysis
There are some general rules to help us determine the running time of an algorithm.

1) Loops: The running time of a loop is, at most, the running time of the statements
inside the loop (including tests) multiplied by the number of iterations.
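For example, a loop of this shape (a minimal C sketch; the function and variable names are illustrative):

void example(int n) {
    int m = 0;
    for (int i = 0; i < n; i++)
        m = m + 2;      // a constant-time statement, executed n times
}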

Total time = (a constant c) × n = cn = O(n).

2) Nested loops: Analyze from the inside out. Total running time is the product of the
sizes of all the loops.
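A minimal C sketch of two nested loops (names are illustrative):

void example(int n) {
    int k = 0;
    for (int i = 0; i < n; i++)         // outer loop: n iterations
        for (int j = 0; j < n; j++)     // inner loop: n iterations per outer pass
            k = k + 1;                  // constant-time statement
}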

Total time = c × n × n = cn² = O(n²).

3) Consecutive statements: Add the time complexities of each statement.
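A minimal C sketch combining a constant-time statement, a simple loop and nested loops (names are illustrative):

void example(int n) {
    int x = 0, m = 0, k = 0;
    x = x + 1;                          // constant time: c0
    for (int i = 0; i < n; i++)         // simple loop: c1*n
        m = m + 2;
    for (int i = 0; i < n; i++)         // nested loops: c2*n^2
        for (int j = 0; j < n; j++)
            k = k + 1;
}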

Total time = c₀ + c₁n + c₂n² = O(n²).

4) If-then-else statements: Worst-case running time: the test, plus either the then part
or the else part, whichever is larger.
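A minimal C sketch where the else part dominates (names are illustrative):

int example(const int a[], int length) {
    int sum = 0;
    if (length == 0) {                    // test: c0; then part: c1
        return -1;
    } else {
        for (int i = 0; i < length; i++)  // else part: (c2 + c3) * n
            sum = sum + a[i];
    }
    return sum;
}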

Total time = c₀ + c₁ + (c₂ + c₃) × n = O(n).

5) Logarithmic complexity: An algorithm is O(logn) if it takes a constant time to cut the
problem size by a fraction (usually by ½). As an example let us consider the following
program:
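A minimal C sketch of such a program (names are illustrative):

#include <stdio.h>

void example(int n) {
    for (int i = 1; i <= n; i *= 2)    // i doubles each time: 1, 2, 4, 8, ...
        printf("data ");               // executed about log2(n) + 1 times
}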

If we observe carefully, the value of i is doubling every time. Initially i = 1, in the
next step i = 2, and in subsequent steps i = 4, 8 and so on. Let us assume that the loop
executes k times. At the kth step 2ᵏ = n, and at the (k + 1)th step we come out of the
loop. Taking the logarithm on both sides gives k = log₂n.

Total time = O(log n).


Simplifying properties of asymptotic notations
• Transitivity: f(n) = Θ(g(n)) and g(n) = Θ(h(n)) ⇒ f(n) = Θ(h(n)). Valid for O and Ω as well.

• Reflexivity: f(n) = Θ(f(n)). Valid for O and Ω.

• Symmetry: f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).

• Transpose symmetry: f(n) = O(g(n)) if and only if g(n) = Ω(f(n)).

• If f(n) is in O(k·g(n)) for any constant k > 0, then f(n) is in O(g(n)).

• If f₁(n) is in O(g₁(n)) and f₂(n) is in O(g₂(n)), then (f₁ + f₂)(n) is in
O(max(g₁(n), g₂(n))).

• If f₁(n) is in O(g₁(n)) and f₂(n) is in O(g₂(n)), then f₁(n)·f₂(n) is in
O(g₁(n)·g₂(n)).
Recursion

What is Recursion?
Any function which calls itself is called recursive. A recursive method solves a problem
by calling a copy of itself to work on a smaller problem. This is called the recursion step.
The recursion step can result in many more such recursive calls.

Why Recursion?
Recursion is a useful technique borrowed from mathematics. Recursive code is
generally shorter and easier to write than iterative code. In some languages, loops
are even turned into recursive functions when they are compiled or interpreted.
Recursion is most useful for tasks that can be defined in terms of similar subtasks. For
example, sort, search, and traversal problems often have simple recursive solutions.

Format of a Recursive Function
A recursive function performs a task in part by calling itself to perform the subtasks. At
some point, the function encounters a subtask that it can perform without calling itself.
This case, where the function does not recur, is called the base case. The case where
the function calls itself to perform a subtask is referred to as the recursive case. We can
write all recursive functions using the format:
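In C-style pseudocode (the test and the values depend on the problem at hand):

if (test for the base case)
    return some base case value;
else
    return (some work and then a recursive call);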

Example:
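A concrete instance in C: the factorial function, n! = n × (n-1)!, with base case 0! = 1 (non-negative input is assumed):

int factorial(int n) {
    if (n == 0)                      // base case
        return 1;
    return n * factorial(n - 1);     // recursive case: smaller problem
}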

Recursion and Memory (Visualization)
Each recursive call makes a new copy of that method (actually only the variables) in
memory. Once a method ends (that is, returns some data), the copy of that returning
method is removed from memory. The recursive solutions look simple, but their
visualization and tracing take time. For better understanding, let us consider the
following example.
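A small C example (the function name is illustrative): each call prints on the way down, recurses, and prints again on the way up as its stack frame is released.

#include <stdio.h>

void print_func(int n) {
    if (n == 0)                // base case: stop recursing
        return;
    printf("%d ", n);          // printed before the recursive call
    print_func(n - 1);         // recursive call on a smaller problem
    printf("%d ", n);          // printed after the call returns
}

// print_func(4) prints: 4 3 2 1 1 2 3 4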

(Figure: call-stack trace of the example for n = 4.)

(Figure: call-stack trace for the factorial function.)
Recursion versus Iteration
While discussing recursion, the basic question that comes to mind is: which way is
better? – iteration or recursion? The answer to this question depends on what we are
trying to do. A recursive approach mirrors the problem that we are trying to solve. A
recursive approach makes it simpler to solve a problem that may not have the most
obvious of answers. But, recursion adds overhead for each recursive call (needs space
on the stack frame).

Recursion
• Terminates when a base case is reached.
• Each recursive call requires extra space on the stack frame (memory).
• If we get infinite recursion, the program may run out of memory and result in stack
overflow.
• Solutions to some problems are easier to formulate recursively.

Iteration
• Terminates when a condition is proven to be false.
• Each iteration does not require extra space.
• An infinite loop could loop forever since there is no extra memory being created.
• Iterative solutions to a problem may not always be as obvious as a recursive solution.
Notes on Recursion
• Recursive algorithms have two types of cases, recursive cases and base cases.
• Every recursive function case must terminate at a base case.
• Generally, iterative solutions are more efficient than recursive solutions [due to the
overhead of function calls].
• A recursive algorithm can be implemented without recursive function calls by using
an explicit stack, but it is usually more trouble than it is worth. That means any
problem that can be solved recursively can also be solved iteratively.
• For some problems, there are no obvious iterative algorithms.
• Some problems are best suited for recursive solutions while others are not.
Example Algorithms of Recursion
• Fibonacci Series, Factorial Finding
• Merge Sort, Quick Sort
• Binary Search
• Tree Traversals and many Tree Problems:
• In Order
• Pre Order
• Post Order
• Graph Traversals:
• DFS [Depth First Search]
• BFS [Breadth First Search]
• Dynamic Programming Examples
• Divide and Conquer Algorithms
• Towers of Hanoi
• Backtracking Algorithms

Divide and Conquer Algorithm

Examples:
• Merge sort
• Quick sort
• Fibonacci series
Examples of Iteration: Fibonacci Series

The series can be generated with either a while loop or a for loop; an if-else
statement handles the first two terms when finding the nth Fibonacci term, as in the
sketch below.
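A minimal iterative C sketch (function and variable names are illustrative):

#include <stdio.h>

// Iterative nth Fibonacci term: 0, 1, 1, 2, 3, 5, ...
int fib(int n) {
    if (n == 0) return 0;               // first term
    if (n == 1) return 1;               // second term
    int prev = 0, curr = 1;
    for (int i = 2; i <= n; i++) {      // each step adds the previous two terms
        int next = prev + curr;
        prev = curr;
        curr = next;
    }
    return curr;
}

int main(void) {
    for (int i = 0; i < 8; i++)
        printf("%d ", fib(i));          // prints: 0 1 1 2 3 5 8 13
    return 0;
}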
Towers of Hanoi

Algorithm for Towers of Hanoi:

Step-1: Shift the top n-1 disks from the initial rod A to the helper rod B. // Recursive call

Step-2: Shift the last (largest) disk from the initial rod A to the final rod C.

Step-3: Shift the n-1 disks from the helper rod B to the final rod C. // Recursive call
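A C sketch of this algorithm (the rod labels and function name are illustrative):

#include <stdio.h>

// Moves n disks from rod 'from' to rod 'to', using rod 'aux' as the helper.
void hanoi(int n, char from, char to, char aux) {
    if (n == 0) return;                   // base case: nothing to move
    hanoi(n - 1, from, aux, to);          // Step-1: n-1 disks to the helper rod
    printf("Move disk %d: %c -> %c\n", n, from, to);   // Step-2: largest disk
    hanoi(n - 1, aux, to, from);          // Step-3: n-1 disks to the final rod
}

int main(void) {
    hanoi(3, 'A', 'C', 'B');              // 2^3 - 1 = 7 moves
    return 0;
}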

(Figure: Towers of Hanoi moves shown step by step.)
Sorting

 What is Sorting?
Sorting is an algorithm that arranges the elements of a list in a certain order [either
ascending or descending]. The output is a permutation or reordering of the input.

 Why Sorting?
Sorting is one of the important categories of algorithms in computer science and a lot of
research has gone into this category. Sorting can significantly reduce the complexity of a
problem, and is often used for database algorithms and searches.

 Classification of Sorting Algorithms


Sorting algorithms are generally categorized based on the following parameters:
 By Number of Comparisons
 By Number of Swaps
 By Memory Usage
 By Recursion
 By Stability
 By Adaptability
Other Classifications
Another method of classifying sorting algorithms is:
• Internal Sort
• External Sort

Internal Sort
Sort algorithms that use main memory exclusively during the sort are called internal
sorting algorithms. This kind of algorithm assumes high-speed random access to all
memory.

External Sort
Sorting algorithms that use external memory, such as tape or disk, during the sort
come under this category.

Bubble Sort
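A C sketch of the improved version (names are illustrative; the early-exit flag is what gives the O(n) best case):

// Bubble sort: repeatedly swap adjacent out-of-order elements;
// stop early when a full pass makes no swaps.
void bubble_sort(int a[], int n) {
    for (int pass = 0; pass < n - 1; pass++) {
        int swapped = 0;
        for (int i = 0; i < n - 1 - pass; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;
                swapped = 1;
            }
        }
        if (!swapped) break;   // already sorted: best case O(n)
    }
}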

Performance
• Worst case complexity: O(n²)
• Best case complexity (improved version): O(n)
• Average case complexity (basic version): O(n²)
• Worst case space complexity: O(1) auxiliary
Selection Sort
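A C sketch (names are illustrative):

// Selection sort: repeatedly select the minimum of the unsorted
// suffix and swap it into its final position.
void selection_sort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min]) min = j;   // index of smallest remaining element
        int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
    }
}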

Advantages
• Easy to implement
• In-place sort (requires no additional storage space)

Disadvantages
• Doesn't scale well: O(n²)

Performance
• Worst case complexity: O(n²)
• Best case complexity: O(n²)
• Average case complexity: O(n²)
• Worst case space complexity: O(1) auxiliary
Insertion Sort
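A C sketch (names are illustrative):

// Insertion sort: grow a sorted prefix by inserting each element
// into its correct position among the already-sorted elements.
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];   // shift larger elements one place right
            j--;
        }
        a[j + 1] = key;
    }
}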

Advantages
• Simple implementation
• Efficient for small data
• Adaptive
• Stable
• In-place
• Online

Performance
• Worst case complexity: Θ(n²)
• Best case complexity: Θ(n)
• Average case complexity: Θ(n²)
• Worst case space complexity: O(n) total, O(1) auxiliary
Merge sort
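A C sketch (the function names and scratch-buffer handling are illustrative assumptions):

#include <stdlib.h>
#include <string.h>

// Merge two sorted halves a[lo..mid) and a[mid..hi) through tmp.
static void merge(int a[], int tmp[], int lo, int mid, int hi) {
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];   // <= keeps the sort stable
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof(int));
}

// Sort a[lo..hi) by dividing, sorting each half, and merging.
static void msort(int a[], int tmp[], int lo, int hi) {
    if (hi - lo < 2) return;              // base case: 0 or 1 element
    int mid = lo + (hi - lo) / 2;
    msort(a, tmp, lo, mid);
    msort(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);
}

void merge_sort(int a[], int n) {
    int *tmp = malloc((size_t)n * sizeof(int));   // Θ(n) auxiliary space
    if (tmp == NULL) return;
    msort(a, tmp, 0, n);
    free(tmp);
}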

Performance
• Worst case complexity: Θ(n log n)
• Best case complexity: Θ(n log n)
• Average case complexity: Θ(n log n)
• Worst case space complexity: Θ(n) auxiliary
Quick sort
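A C sketch using the Lomuto partition scheme (one common choice; names are illustrative). quick_sort(a, 0, n - 1) sorts a[0..n-1]:

// Partition a[lo..hi] around pivot a[hi]; return the pivot's final index.
static int partition(int a[], int lo, int hi) {
    int pivot = a[hi];
    int i = lo;                           // boundary of the "smaller" region
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

void quick_sort(int a[], int lo, int hi) {
    if (lo >= hi) return;                 // base case: 0 or 1 element
    int p = partition(a, lo, hi);
    quick_sort(a, lo, p - 1);             // conquer the left part
    quick_sort(a, p + 1, hi);             // conquer the right part
}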
Performance
• Worst case complexity: O(n²)
• Best case complexity: O(n log n)
• Average case complexity: O(n log n)
• Worst case space complexity: O(1)
Heap sort
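A C sketch (names are illustrative): build a max-heap in the array, then repeatedly move the maximum to the end and restore the heap on the shrunken prefix.

// Restore the max-heap property for the subtree rooted at i (heap size n).
static void sift_down(int a[], int n, int i) {
    for (;;) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && a[l] > a[largest]) largest = l;
        if (r < n && a[r] > a[largest]) largest = r;
        if (largest == i) return;
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        i = largest;
    }
}

void heap_sort(int a[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)  // build the max-heap bottom-up
        sift_down(a, n, i);
    for (int i = n - 1; i > 0; i--) {     // extract the maximum one by one
        int t = a[0]; a[0] = a[i]; a[i] = t;
        sift_down(a, i, 0);
    }
}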

Performance
• Worst case performance: Θ(n log n)
• Best case performance: Θ(n log n)
• Average case performance: Θ(n log n)
• Worst case space complexity: Θ(n) total, Θ(1) auxiliary
Radix sort
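A C sketch of LSD (least-significant-digit) radix sort on non-negative integers, using a counting sort per decimal digit (names are illustrative):

#include <stdlib.h>

void radix_sort(int a[], int n) {
    if (n <= 0) return;
    int max = a[0];
    for (int i = 1; i < n; i++)
        if (a[i] > max) max = a[i];
    int *out = malloc((size_t)n * sizeof(int));
    if (out == NULL) return;
    for (int exp = 1; max / exp > 0; exp *= 10) {   // one pass per digit
        int count[10] = {0};
        for (int i = 0; i < n; i++) count[(a[i] / exp) % 10]++;
        for (int d = 1; d < 10; d++) count[d] += count[d - 1];
        for (int i = n - 1; i >= 0; i--)            // right-to-left keeps it stable
            out[--count[(a[i] / exp) % 10]] = a[i];
        for (int i = 0; i < n; i++) a[i] = out[i];
    }
    free(out);
}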

Time Complexity: O(nd) ≈ O(n), if the number of digits d is small.
Comparison of Sorting Algorithms
We can learn several problem-solving approaches using sorting algorithms.

Comparison-based techniques:
 Incremental approach: Bubble, Selection, Insertion
 Swapping technique: Bubble, Selection
 Sifting technique: Insertion
 Divide and conquer: Merge, Quick
 Problem solving using a data structure: Heap

Non-comparison technique: Radix

Sorting techniques are compared on different parameters: time and space complexity,
whether they are in-place, and whether they are stable.
In-place sorting algorithms: A sorting algorithm is in-place if it does not use extra
space to manipulate the input, though it may require a small constant amount of extra
space for its operation.
In-place sorting algorithms: Bubble, Insertion, Quick, Selection, Heap
Sorting algorithms with extra space: Merge, Radix

Stable sorting algorithms: A sorting algorithm is stable if it does not change the
relative order of elements with the same value.
Stable sorting algorithms: Bubble, Insertion, Merge, Radix
Non-stable sorting algorithms: Selection, Quick, Heap
Searching

 Searching is the process of finding an item with specified properties from a
collection of items. The items may be stored as records in a database, simple data
elements in arrays, text in files, nodes in trees, vertices and edges in graphs, or they
may be elements of other search spaces.

 Why Searching?
We know that today's computers store a lot of information. To retrieve this
information efficiently we need very good searching algorithms. If we keep the data
in proper order, it is easy to search for the required element. Sorting is one of the
techniques for making the elements ordered.
Types of Searching
• Unordered Linear Search
• Sorted/Ordered Linear Search
• Binary Search
• Interpolation search
• Binary Search Trees (operates on trees)
• Symbol Tables and Hashing
• String Searching Algorithms: Tries, Ternary Search and Suffix Trees

Unordered Linear Search
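A C sketch (names are illustrative): scan every element until the key is found; since the array is unordered, no element can be skipped.

// Returns the index of 'key' in a[0..n-1], or -1 if it is absent.
int linear_search(const int a[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}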

Time complexity: O(n); in the worst case we need to scan the complete array.
Space complexity: O(1).
Sorted/Ordered Linear Search
• Best case: O(1)
• Average case: O(n)
• Worst case: O(n)
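A C sketch (names are illustrative); in a sorted array the scan can stop as soon as the elements exceed the key:

int ordered_linear_search(const int a[], int n, int key) {
    for (int i = 0; i < n; i++) {
        if (a[i] == key) return i;
        if (a[i] > key)  return -1;   // key cannot appear later in sorted order
    }
    return -1;
}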

Recursive Binary Search
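A C sketch (names are illustrative); the array must already be sorted:

// Search a[lo..hi] for 'key'; returns its index, or -1 if absent.
int binary_search_rec(const int a[], int lo, int hi, int key) {
    if (lo > hi) return -1;               // base case: empty range
    int mid = lo + (hi - lo) / 2;         // avoids overflow of (lo + hi) / 2
    if (a[mid] == key) return mid;
    if (a[mid] < key)
        return binary_search_rec(a, mid + 1, hi, key);   // right half
    return binary_search_rec(a, lo, mid - 1, key);       // left half
}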

Iterative Binary Search
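A C sketch (names are illustrative); the loop halves the search range each iteration:

int binary_search_iter(const int a[], int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == key) return mid;
        if (a[mid] < key)  lo = mid + 1;   // continue in the right half
        else               hi = mid - 1;   // continue in the left half
    }
    return -1;
}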

Time Complexity: O(log n).
Space Complexity: O(1).
