
Unit 4 ADSA 13 mark


Unit 4

13-mark questions:
1. Explain about dynamic programming and its elements.

Dynamic programming helps to solve a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions.

A dynamic programming algorithm will examine the previously solved subproblems and combine their solutions to give the best solution for the given problem.

Dynamic programming is used for problems that can be divided into similar subproblems, so that their results can be reused. Mostly, these algorithms are used for optimization. Before solving the subproblem at hand, a dynamic programming algorithm examines the results of the previously solved subproblems. The solutions of subproblems are combined in order to achieve the best solution. It is used when the solution can be recursively described in terms of solutions to subproblems (optimal substructure).

The following computer problems can be solved using dynamic programming approach

• Fibonacci sequence
• 0-1 knapsack problem
• Longest Common Subsequence problem
• Matrix chain multiplication
• All pairs shortest path problem
• Subset-sum problem
• Optimal Binary Search Tree

In any dynamic programming problem, when we divide the problem, the subproblems are not independent; instead, the solutions to these subproblems are used to compute the values for the larger problems.

These repeating subproblems are used again and again, and hence their results are stored. If the results of subproblems were not stored, we would have to compute them again and again, which would be very costly.

So, even if we can solve a problem with plain iteration/recursion, dynamic programming makes it faster and more efficient by using some extra space.

By storing the solutions of sub-problems in extra memory space, we thereby reduce the
time taken to calculate the complete solution for a problem. This method of storing the
answers to similar subproblems is called memoization.

Because of this, dynamic programming can solve a vast variety of problems involving optimal solutions, which means finding the best case out of all possibilities. In these cases too, dynamic programming uses smaller subproblems to reach the final optimal solution.

In Dynamic Programming, we analyse all subproblems to find the optimal solution, but because we store these values, no subproblem is recomputed. This increases the efficiency of a Dynamic Programming solution and also assures a correct solution, as we are essentially checking all the cases, but efficiently.

A Dynamic Programming algorithm works by computing the answers to all possible subproblems and storing them to find the final answer. This process of storing the values of subproblems is called memoization, which helps in reaching the final solution efficiently.

Characteristics of Dynamic Programming

Two characteristics that a problem must have for a Dynamic Programming solution to work are:

1. Optimal Substructure
2. Overlapping Subproblems

Optimal Substructure

If a complex problem has the Optimal Substructure property, it means that to find an optimal solution for the problem, the optimal solutions of its smaller subproblems are needed.

So, by combining the optimal solutions of subproblems, we can obtain a solution efficiently.

This property is used by the Dynamic Programming algorithm to ensure a correct solution to the given problem.

Taking the example of the Fibonacci Sequence again, the nth term is given by:

F(n) = F(n-1) + F(n-2)

So, to find the nth term, we need to find the correct solution for the (n-1)th and the (n-2)th terms, which are smaller subproblems of the complete problem. And again, to find the (n-1)th term and the (n-2)th term, we need to find the solutions to even smaller subproblems.

Overlapping Subproblems

A problem is said to have the Overlapping Subproblems property if it can be broken down into smaller parts called subproblems, which need to be solved again and again. This means that the same subproblem is needed again and again to generate the final solution.

This property is used by Dynamic Programming in the sense that it stores the solutions to these recurring subproblems so that we do not need to compute them again and again.

For example, in the Fibonacci Sequence problem, we have

F(n) = F(n-1) + F(n-2)


F(n-1) = F(n-2) + F(n-3)
F(n-2) = F(n-3) + F(n-4)
F(n-3) = F(n-4) + F(n-5)

From this, it is clear that the subproblems F(n-2), F(n-3), and F(n-4) are used again and
again to find the final solution.
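
To make this concrete, here is a minimal C++ sketch of a memoized Fibonacci (an illustration in the style of the later code examples, not part of the original question): the vector memo stores each F(i) the first time it is computed, so every subproblem is solved only once.

#include <bits/stdc++.h>
using namespace std;

// Memoized Fibonacci: each subproblem F(i) is computed once and cached.
long long fib(int n, vector<long long>& memo)
{
    if (n <= 1)
        return n;
    if (memo[n] != -1) // overlapping subproblem: reuse the stored answer
        return memo[n];
    memo[n] = fib(n - 1, memo) + fib(n - 2, memo);
    return memo[n];
}

int main()
{
    int n = 40;
    vector<long long> memo(n + 1, -1);
    cout << "F(" << n << ") = " << fib(n, memo);
    return 0;
}

Without the memo table this recursion takes exponential time; with it, each of the n subproblems is solved exactly once.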

Components of Dynamic Programming

The major components in any Dynamic Programming solution are:

1. Stages
2. States and state variables
3. State Transition
4. Optimal Choice

1. Stages

When a complex problem is divided into several subproblems, each subproblem forms
a stage of the solution. After calculating the solution for each stage and choosing the
best ones we get to the final optimized solution.

2. States and State Variables

Each subproblem can be associated with several states. States differ in their solutions because of different choices. A state for a subproblem is therefore defined more clearly based on a parameter, which is called the state variable. In some problems, more than one variable may be needed to define a state distinctly; in such cases, there are multiple state variables.

3. State Transition

State Transition simply refers to how one subproblem relates to other subproblems. By
using this state transition, we calculate our end solution.

4. Optimal Choice

At each stage, we need to choose the option which leads to the most desirable solution.
Choosing the most desirable option at every stage will eventually lead to an optimal
solution in the end.
Elements Of Dynamic Programming

Three elements of the Dynamic Programming algorithm are :

1. Substructure
2. Table Structure
3. Bottom-Up Computation (Memoization)

The elements in a Dynamic Programming Solution are discussed below:

• To solve a given complex problem and to find its optimal solution, it is broken
down into similar but smaller and easily computable problems called
subproblems. Hence, the complete solution depends on many smaller problems
and their solutions. We get to the final optimal solution after going through all
subproblems and selecting the most optimal ones. This is the substructure
element of any Dynamic Programming solution.
• Any Dynamic Programming solution involves storing the optimal solutions of
the subproblems so that they don't have to be computed again and again. To
store these solutions a table structure is needed. So, for example arrays in C++
or ArrayList in Java can be used. By using this structured table, the solutions of
previous subproblems are reused.
• The solutions to subproblems need to be computed first to be reused again. This
is called Bottom-Up Computation because we start storing values from the
bottom and then consequently upwards. The solutions to the smaller
subproblems are combined to get the final solution to the original problem.
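
As a small illustration of the table structure and bottom-up computation, here is a minimal C++ sketch (again using Fibonacci, the running example of this section): the vector fib is the table, filled from the smallest subproblems upward.

#include <bits/stdc++.h>
using namespace std;

int main()
{
    int n = 40;
    // Table structure: fib[i] holds the solution to subproblem F(i).
    vector<long long> fib(n + 1, 0);
    if (n >= 1)
        fib[1] = 1;
    // Bottom-up computation: fill the table from the bottom upward,
    // reusing the stored solutions of the smaller subproblems.
    for (int i = 2; i <= n; i++)
        fib[i] = fib[i - 1] + fib[i - 2];
    cout << "F(" << n << ") = " << fib[n];
    return 0;
}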

2. Explain how to solve the following problems using dynamic programming with
example.

a) Longest Common Subsequence.

b) Matrix Chain Multiplication.

Longest Common Subsequence

LCS Problem Statement: Given two sequences, find the length of the longest subsequence present in both of them. A subsequence is a sequence that appears in the same relative order but is not necessarily contiguous. For example, “abc”, “abg”, “bdf”, “aeg”, “acefg”, etc. are subsequences of “abcdefg”.
The naive solution for this problem is to generate all subsequences of both given sequences and find the longest matching subsequence. This solution is exponential in terms of time complexity. Let us see how this problem possesses both important properties of a Dynamic Programming (DP) problem.
1) Optimal Substructure:
Let the input sequences be X[0..m-1] and Y[0..n-1] of lengths m and n respectively.
And let L(X[0..m-1], Y[0..n-1]) be the length of LCS of the two sequences X and Y.
Following is the recursive definition of L(X[0..m-1], Y[0..n-1]).
If last characters of both sequences match (or X[m-1] == Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2])
If last characters of both sequences do not match (or X[m-1] != Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = MAX ( L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2]) )
Examples:
1) Consider the input strings “AGGTAB” and “GXTXAYB”. Last characters match
for the strings. So length of LCS can be written as:
L(“AGGTAB”, “GXTXAYB”) = 1 + L(“AGGTA”, “GXTXAY”)

2) Consider the input strings “ABCDGH” and “AEDFHR”. Last characters do not
match for the strings. So length of LCS can be written as:
L(“ABCDGH”, “AEDFHR”) = MAX ( L(“ABCDG”, “AEDFHR”), L(“ABCDGH”,
“AEDFH”) )
So the LCS problem has optimal substructure property as the main problem can be
solved using solutions to subproblems.
2) Overlapping Subproblems:
Following is a simple recursive implementation of the LCS problem. The implementation simply follows the recursive structure mentioned above.

#include <bits/stdc++.h>
using namespace std;

/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs(char* X, char* Y, int m, int n)
{
    if (m == 0 || n == 0)
        return 0;
    if (X[m - 1] == Y[n - 1])
        return 1 + lcs(X, Y, m - 1, n - 1);
    else
        return max(lcs(X, Y, m, n - 1), lcs(X, Y, m - 1, n));
}

/* Driver code */
int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
    int m = strlen(X);
    int n = strlen(Y);
    cout << "Length of LCS is " << lcs(X, Y, m, n);
    return 0;
}
Output: Length of LCS is 4
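The recursion above recomputes the same L(i, j) values many times. A bottom-up tabulated version along the usual lines (a sketch; the source shows only the recursive implementation) stores each L(i, j) in a table so that every subproblem is solved once, giving O(m*n) time:

#include <bits/stdc++.h>
using namespace std;

// Bottom-up LCS: L[i][j] = length of LCS of X[0..i-1] and Y[0..j-1].
int lcsDP(const string& X, const string& Y)
{
    int m = X.size(), n = Y.size();
    vector<vector<int>> L(m + 1, vector<int>(n + 1, 0));
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (X[i - 1] == Y[j - 1])
                L[i][j] = 1 + L[i - 1][j - 1]; // last characters match
            else
                L[i][j] = max(L[i - 1][j], L[i][j - 1]);
        }
    }
    return L[m][n];
}

int main()
{
    cout << "Length of LCS is " << lcsDP("AGGTAB", "GXTXAYB"); // prints 4
    return 0;
}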
Matrix Chain Multiplication
Given a sequence of matrices, find the most efficient way to multiply these matrices
together. The problem is not actually to perform the multiplications, but merely to
decide in which order to perform the multiplications.
We have many options to multiply a chain of matrices because matrix multiplication
is associative. In other words, no matter how we parenthesize the product, the result
will be the same. For example, if we had four matrices A, B, C, and D, we would
have:
(ABC)D = (AB)(CD) = A(BCD) = ....
However, the order in which we parenthesize the product affects the number of simple
arithmetic operations needed to compute the product, or the efficiency. For example,
suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,
(AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Clearly the first parenthesization requires fewer operations.
Given an array p[] which represents the chain of matrices such that the ith matrix Ai
is of dimension p[i-1] x p[i]. We need to write a function MatrixChainOrder() that
should return the minimum number of multiplications needed to multiply the chain.
Input: p[] = {10, 20, 30}
Output: 6000
There are only two matrices, of dimensions 10x20 and 20x30, so there is only
one way to multiply them; the cost is 10*20*30 = 6000.
1) Optimal Substructure:
A simple solution is to place parentheses at all possible places, calculate the cost of each placement, and return the minimum value. In a chain of n matrices, we can place the first set of parentheses in n-1 ways. For example, if the given chain is of 4 matrices, say ABCD, then there are 3 ways to place the first set of parentheses on the outer side: (A)(BCD), (AB)(CD) and (ABC)(D). So when we place a set of parentheses, we divide the problem into subproblems of smaller size. Therefore, the problem has the optimal substructure property and can be easily solved using recursion.
Minimum number of multiplications needed to multiply a chain of size n = minimum over all n-1 placements (these placements create subproblems of smaller size).
2) Overlapping Subproblems
Following is a recursive implementation that simply follows the above optimal substructure property:
#include <bits/stdc++.h>
using namespace std;

// Matrix Ai has dimension p[i-1] x p[i] for i = 1..n
int MatrixChainOrder(int p[], int i, int j)
{
    if (i == j)
        return 0;
    int min = INT_MAX;
    // Place parenthesis at different places between the first and
    // last matrix, recursively calculate the count of multiplications
    // for each parenthesis placement, and keep the minimum.
    for (int k = i; k < j; k++) {
        int count = MatrixChainOrder(p, i, k)
                    + MatrixChainOrder(p, k + 1, j)
                    + p[i - 1] * p[k] * p[j];
        if (count < min)
            min = count;
    }
    // Return minimum count
    return min;
}

// Driver Code
int main()
{
    int arr[] = { 1, 2, 3, 4, 3 };
    int n = sizeof(arr) / sizeof(arr[0]);
    cout << "Minimum number of multiplications is "
         << MatrixChainOrder(arr, 1, n - 1);
    return 0;
}
Output
Minimum number of multiplications is 30
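Like LCS, this recursion solves the same subproblems repeatedly. A bottom-up tabulated version along standard lines (a sketch; the source shows only the recursive implementation) fills a table m[i][j] = minimum multiplications for the chain Ai..Aj, in increasing order of chain length:

#include <bits/stdc++.h>
using namespace std;

// Bottom-up matrix chain order. Matrix Ai has dimension p[i-1] x p[i].
int MatrixChainOrderDP(int p[], int n)
{
    // m[i][j] = minimum multiplications needed to compute Ai..Aj
    vector<vector<int>> m(n, vector<int>(n, 0));
    for (int len = 2; len < n; len++) { // chain length at this stage
        for (int i = 1; i <= n - len; i++) {
            int j = i + len - 1;
            m[i][j] = INT_MAX;
            for (int k = i; k < j; k++) { // split point
                int cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
                if (cost < m[i][j])
                    m[i][j] = cost;
            }
        }
    }
    return m[1][n - 1];
}

int main()
{
    int arr[] = { 1, 2, 3, 4, 3 };
    int n = sizeof(arr) / sizeof(arr[0]);
    cout << "Minimum number of multiplications is "
         << MatrixChainOrderDP(arr, n); // prints 30
    return 0;
}

This brings the running time down from exponential to O(n^3), using O(n^2) extra space.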
3. Explain about Greedy Algorithm and its elements.
Among all the algorithmic approaches, the simplest and most straightforward is the Greedy method. In this approach, the decision is taken on the basis of the currently available information, without worrying about the effect of the current decision in the future.

Greedy algorithms build a solution part by part, choosing the next part in such a way that it gives an immediate benefit. This approach never reconsiders the choices taken previously. It is mainly used to solve optimization problems. The Greedy method is easy to implement and quite efficient in most cases. Hence, we can say that a Greedy algorithm is an algorithmic paradigm based on a heuristic that makes the locally optimal choice at each step with the hope of finding a globally optimal solution.

In many problems it does not produce an optimal solution, though it gives an approximate (near-optimal) solution in a reasonable time.
Greedy algorithms have the following five components −
• A candidate set − A solution is created from this set.
• A selection function − Used to choose the best candidate to be added to the solution.
• A feasibility function − Used to determine whether a candidate can be used to
contribute to the solution.
• An objective function − Used to assign a value to a solution or a partial solution.
• A solution function − Used to indicate whether a complete solution has been reached.
Greedy approach is used to solve many problems, such as
• Finding the shortest path between two vertices using Dijkstra’s algorithm.
• Finding the minimal spanning tree in a graph using Prim’s /Kruskal’s algorithm, etc.
In many problems, a Greedy algorithm fails to find an optimal solution; moreover, it may even produce the worst possible solution. Problems like Travelling Salesman and 0-1 Knapsack cannot be solved optimally using this approach.

Characteristics of Greedy method

The following are the characteristics of a greedy method:

• To construct the solution in an optimal way, this algorithm creates two sets where
one set contains all the chosen items, and another set contains the rejected items.
• A Greedy algorithm makes good local choices in the hope that the resulting solution will be feasible or optimal.

Advantages
The biggest advantage that the Greedy algorithm has over others is that it is easy to implement and very efficient in most cases.

Disadvantages of using Greedy algorithm

A Greedy algorithm makes decisions based on the information available at each phase without considering the broader problem. So, there is a possibility that the greedy solution does not give the best solution for every problem.

It follows the locally optimal choice at each stage with the intent of finding the global optimum.

Greedy Algorithm

1. To begin with, the solution set (containing answers) is empty.

2. At each step, an item is added to the solution set until a solution is reached.

3. If the solution set is feasible, the current item is kept.

4. Else, the item is rejected and never considered again.

Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit. Greedy algorithms are used for optimization problems. An optimization problem can be solved using Greedy if the problem has the following property: at every step, we can make a choice that looks best at the moment, and we get the optimal solution of the complete problem.

If a Greedy Algorithm can solve a problem, then it generally becomes the best method
to solve that problem as the Greedy algorithms are in general more efficient than other
techniques like Dynamic Programming. But Greedy algorithms cannot always be
applied. For example, the Fractional Knapsack problem can be solved using Greedy,
but 0-1 Knapsack cannot be solved using Greedy.
Following are some standard algorithms that are Greedy algorithms.

1) Kruskal’s Minimum Spanning Tree (MST): In Kruskal’s algorithm, we create an MST by picking edges one by one. The Greedy Choice is to pick the smallest-weight edge that doesn’t cause a cycle in the MST constructed so far.

2) Prim’s Minimum Spanning Tree: In Prim’s algorithm also, we create an MST by picking edges one by one. We maintain two sets: a set of the vertices already included in the MST and the set of the vertices not yet included. The Greedy Choice is to pick the smallest-weight edge that connects the two sets.

3) Dijkstra’s Shortest Path: Dijkstra’s algorithm is very similar to Prim’s algorithm. The shortest-path tree is built up edge by edge. We maintain two sets: a set of the vertices already included in the tree and the set of the vertices not yet included. The Greedy Choice is to pick the edge that connects the two sets and lies on the smallest-weight path from the source to a not-yet-included vertex.

4) Huffman Coding: Huffman Coding is a lossless compression technique. It assigns variable-length bit codes to different characters. The Greedy Choice is to assign the shortest code to the most frequent character.

Greedy algorithms are sometimes also used to get approximations for NP-Hard optimization problems. For example, the Travelling Salesman Problem is NP-Hard. A Greedy choice for this problem is to pick the nearest unvisited city from the current city at every step. These solutions don’t always produce an optimal solution, but they can be used to get an approximately optimal one.

Activity Selection Problem

Given n activities with their start and finish times, select the maximum number of activities that can be performed by a single person, assuming that a person can only work on a single activity at a time.
Example: Consider the following 6 activities sorted by finish time.
start[] = {1, 3, 0, 5, 8, 5};
finish[] = {2, 4, 6, 7, 9, 9};
A person can perform at most four activities. The maximum set of activities that can
be executed is {0, 1, 3, 4} [ These are indexes in start[] and finish[] ]
The greedy choice is to always pick the next activity whose finish time is least among the remaining activities and whose start time is greater than or equal to the finish time of the previously selected activity. We can sort the activities according to their finish time so that we always consider the next activity as the one with the minimum finish time.
1) Sort the activities according to their finishing time
2) Select the first activity from the sorted array and print it.
3) Do the following for the remaining activities in the sorted array: if the start time of the activity is greater than or equal to the finish time of the previously selected activity, then select this activity and print it.
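
A short C++ sketch of these steps (assuming, as in the example above, that the activities are already sorted by finish time):

#include <bits/stdc++.h>
using namespace std;

// Prints a maximum-size set of non-overlapping activities.
// Assumes the activities are already sorted by finish time.
void printMaxActivities(int start[], int finish[], int n)
{
    cout << "Selected activities: ";
    int i = 0; // the first activity is always selected
    cout << i << " ";
    for (int j = 1; j < n; j++) {
        // Select activity j if it starts at or after the finish
        // time of the previously selected activity i.
        if (start[j] >= finish[i]) {
            cout << j << " ";
            i = j;
        }
    }
}

int main()
{
    int start[] = { 1, 3, 0, 5, 8, 5 };
    int finish[] = { 2, 4, 6, 7, 9, 9 };
    int n = sizeof(start) / sizeof(start[0]);
    printMaxActivities(start, finish, n); // prints 0 1 3 4
    return 0;
}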
4. Explain Huffman Coding with example.
Huffman coding is a lossless data compression algorithm. The idea is to assign
variable-length codes to input characters, lengths of the assigned codes are based on
the frequencies of corresponding characters. The most frequent character gets the
smallest code and the least frequent character gets the largest code.
The variable-length codes assigned to input characters are Prefix Codes, meaning the codes (bit sequences) are assigned in such a way that the code assigned to one character is never the prefix of the code assigned to any other character. This is how Huffman Coding makes sure that there is no ambiguity when decoding the generated bitstream.
Example: Let there be four characters a, b, c and d, and their corresponding variable
length codes be 00, 01, 0 and 1. This coding leads to ambiguity because code assigned
to c is the prefix of codes assigned to a and b. If the compressed bit stream is 0001,
the de-compressed output may be “cccd” or “ccb” or “acd” or “ab”.
There are mainly two major parts in Huffman Coding
• Build a Huffman Tree from input characters.
• Traverse the Huffman Tree and assign codes to characters.
Steps to build Huffman Tree
Input is an array of unique characters along with their frequency of occurrences and
output is Huffman Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes (the min heap is used as a priority queue; the value of the frequency field is used to compare two nodes, so initially the least frequent character is at the root).
2. Extract the two nodes with the minimum frequency from the min heap.
3. Create a new internal node with a frequency equal to the sum of the two nodes' frequencies. Make the first extracted node its left child and the other extracted node its right child. Add this node to the min heap.
4. Repeat steps #2 and #3 until the heap contains only one node. The remaining node is the root node and the tree is complete.
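
A compact C++ sketch of this construction (a minimal illustration using priority_queue as the min heap, not a full codec; the node layout and names are my own):

#include <bits/stdc++.h>
using namespace std;

// A node of the Huffman tree.
struct Node {
    char ch;   // character ('\0' for internal nodes)
    int freq;  // frequency (for internal nodes, the sum of the children)
    Node *left, *right;
    Node(char c, int f) : ch(c), freq(f), left(nullptr), right(nullptr) {}
};

// Orders the min heap by frequency.
struct Compare {
    bool operator()(Node* a, Node* b) { return a->freq > b->freq; }
};

// Step 0 for a left edge, 1 for a right edge; print codes at the leaves.
void printCodes(Node* root, const string& code)
{
    if (!root)
        return;
    if (!root->left && !root->right) { // leaf: print its code
        cout << root->ch << " " << code << "\n";
        return;
    }
    printCodes(root->left, code + "0");
    printCodes(root->right, code + "1");
}

int main()
{
    char chars[] = { 'a', 'b', 'c', 'd', 'e', 'f' };
    int freqs[] = { 5, 9, 12, 13, 16, 45 };
    priority_queue<Node*, vector<Node*>, Compare> heap;
    for (int i = 0; i < 6; i++)
        heap.push(new Node(chars[i], freqs[i]));
    // Steps 2-4: repeatedly merge the two least frequent nodes.
    while (heap.size() > 1) {
        Node* left = heap.top(); heap.pop();
        Node* right = heap.top(); heap.pop();
        Node* parent = new Node('\0', left->freq + right->freq);
        parent->left = left;
        parent->right = right;
        heap.push(parent);
    }
    printCodes(heap.top(), "");
    return 0;
}

With the frequencies of the worked example below, this produces the codes shown at the end of the section (though tie-breaking inside the heap can change which of several equally optimal code assignments is produced).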
Example:
Character Frequency
a 5
b 9
c 12
d 13
e 16
f 45
Step 1: Build a min heap that contains 6 nodes, where each node represents the root of a tree with a single node.

Step 2: Extract two minimum frequency nodes from min heap. Add a new internal
node with frequency 5 + 9 = 14.



Now the min heap contains 5 nodes, where 4 nodes are roots of trees with a single element each, and one heap node is the root of a tree with 3 elements.

character Frequency
c 12
d 13
Internal Node 14
e 16
f 45
Step 3: Extract two minimum frequency nodes from heap. Add a new internal
node with frequency 12 + 13 = 25


Now the min heap contains 4 nodes, where 2 nodes are roots of trees with a single element each, and two heap nodes are roots of trees with more than one node.

character Frequency
Internal Node 14
e 16
Internal Node 25
f 45
Step 4: Extract two minimum frequency nodes. Add a new internal node with
frequency 14 + 16 = 30

Now min heap contains 3 nodes.

character Frequency
Internal Node 25
Internal Node 30
f 45
Step 5: Extract two minimum frequency nodes. Add a new internal node with
frequency 25 + 30 = 55


Now min heap contains 2 nodes.

character Frequency
f 45
Internal Node 55
Step 6: Extract two minimum frequency nodes. Add a new internal node with
frequency 45 + 55 = 100


Now min heap contains only one node.


character Frequency
Internal Node 100
Since the heap contains only one node, the algorithm stops here.

Steps to print codes from Huffman Tree:


Traverse the tree formed starting from the root. Maintain an auxiliary array. While
moving to the left child, write 0 to the array. While moving to the right child,
write 1 to the array. Print the array when a leaf node is encountered.


The codes are as follows:

character code-word
f 0
c 100
d 101
a 1100
b 1101
e 111
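
As a quick check of the benefit: the total frequency is 5 + 9 + 12 + 13 + 16 + 45 = 100, and the weighted code length is 45*1 + 12*3 + 13*3 + 16*3 + 5*4 + 9*4 = 45 + 36 + 39 + 48 + 20 + 36 = 224 bits, i.e. 2.24 bits per character on average, compared with the 3 bits per character a fixed-length code would need for 6 characters.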
