3.1. Introduction
Dynamic Programming (DP) is one of the most powerful design techniques for
solving optimization problems. It was invented by the mathematician Richard
Bellman in the 1950s. DP is closely related to the divide-and-conquer technique,
in which the problem is divided into smaller sub-problems and each sub-problem is
solved recursively. DP differs from divide and conquer in that, instead of
solving sub-problems recursively, it solves each sub-problem only once and
stores its solution in a table. The solution to the main problem
is then obtained by combining the solutions of these sub-problems.
The steps of the Dynamic Programming technique are:
Dividing the problem into sub-problems: The main problem is divided
into smaller sub-problems, and the solution of the main problem is expressed in
terms of the solutions of these smaller sub-problems.
Storing the sub-solutions in a table: The solution of each sub-problem is
stored in a table so that it can be looked up whenever it is required again.
Bottom-up computation: The DP technique starts with the smallest
problem instances, builds up solutions to instances of increasing size,
and finally obtains the solution of the original problem instance.
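As a minimal illustration of these three steps (an example not in the original text), here is a bottom-up computation of Fibonacci numbers in Python, storing each sub-solution in a table:

```python
def fib(n):
    # table[i] stores the solution of sub-problem i (storing sub-solutions)
    table = [0] * (n + 1)
    if n >= 1:
        table[1] = 1
    # bottom-up computation: smallest instances first
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

print(fib(10))  # → 55
```

Each table entry is computed exactly once, whereas the naive recursion would re-solve the same sub-problems exponentially many times.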
Difference between Divide and Conquer and Dynamic Programming

1. Divide and Conquer: The divide-and-conquer paradigm involves three steps at each level of the recursion:
• Divide the problem into a number of sub-problems.
• Conquer the sub-problems by solving them recursively. If the sub-problem sizes are small enough, just solve the sub-problems in a straightforward manner.
• Combine the solutions to the sub-problems into the solution for the original problem.
Dynamic Programming: The development of a dynamic-programming algorithm can be broken into a sequence of four steps:
a. Characterize the structure of an optimal solution.
b. Recursively define the value of an optimal solution.
c. Compute the value of an optimal solution in a bottom-up fashion.
d. Construct an optimal solution from computed information.
2. Divide and Conquer: The algorithms call themselves recursively one or more times to deal with closely related sub-problems.
Dynamic Programming: Bottom-up Dynamic Programming is not recursive.
3. Divide and Conquer: D&C does more work on the sub-problems (it may re-solve them) and hence has more time consumption.
Dynamic Programming: DP solves each sub-problem only once and then stores the result in a table.
4. Divide and Conquer: In D&C the sub-problems are independent of each other.
Dynamic Programming: In DP the sub-problems are not independent (they overlap).
5. Divide and Conquer: Examples: Merge Sort, Binary Search.
Dynamic Programming: Example: Matrix chain multiplication.
Steps of Dynamic Programming
Dynamic programming design involves four major steps:
1. Characterize the structure of an optimal solution. That is, develop a
mathematical notation that can express any solution and sub-solution of the
given problem.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution bottom-up. For this you have to
develop a recurrence relation that relates a solution to its sub-solutions,
using the mathematical notation of step 1.
4. Construct an optimal solution from the computed information.
Principle of optimality
The dynamic programming algorithm obtains the solution using the principle of
optimality.
The principle of optimality states that "in an optimal sequence of decisions or choices,
each subsequence must also be optimal".
When it is not possible to apply the principle of optimality, it is almost
impossible to obtain the solution using the dynamic programming approach.
3.2. Rod cutting
Problem description: given a rod of length n units, and the price of all
pieces of length smaller than or equal to n, find the most profitable way of cutting the rod.
(Figure: the possible cut arrangements; the remaining ways to cut are just permutations of the arrangements shown.)
Proof of optimal substructure:
Let's say we had the optimal solution for cutting the rod, ci..j, where ci is the
first piece and cj is the last piece.
Take one of the cuts from this solution, somewhere in the middle, say
k, and split it so we have two sub-problems, ci..k and ck+1..j (assuming our
optimal solution is not just a single piece).
Now assume we had a more optimal way of cutting ci..k.
We would swap out the old ci..k and replace it with the more optimal ci..k.
Overall, the entire problem would now have an even more optimal solution.
But we had already stated that we had the optimal solution! This is a
contradiction!
Therefore our original solution is optimal, and this
problem exhibits optimal substructure.
Let's define C(i) as the price of the optimal cut of the rod up to length i.
Let Vk be the price of a piece of length k.
How to develop a solution:
We define the smallest problems first, and store their solutions.
We then increase the rod length, and try all the cuts for that size of rod, taking
the most profitable one.
We store the optimal solution for this sized piece, and build solutions to
larger pieces from them in some sort of data structure (a table).
C(1) = V1 + C(1-1) = 1
C(2) = max{ V1 + C(2-1) = 1 + 1 = 2,
            V2 + C(2-2) = 5 } = 5
C(3) = max{ V1 + C(3-1) = 1 + 5 = 6,
            V2 + C(3-2) = 5 + 1 = 6,
            V3 + C(3-3) = 8 } = 8
C(4) = max{ V1 + C(4-1) = 1 + 8 = 9,
            V2 + C(4-2) = 5 + 5 = 10,
            V3 + C(4-3) = 8 + 1 = 9,
            V4 + C(4-4) = 9 } = 10
Here we can see the highest value is 10, so the optimal solution for a rod of length 4 is two pieces of length 2.
Without dynamic programming, the problem has a complexity of O(2^n)!
For a rod of length 8, there are 128 (or 2^(n-1)) ways to cut it!
With dynamic programming (top-down or bottom-up), the problem is
reduced to O(n^2).
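The computation above can be sketched in Python. This is a bottom-up version (an illustration, not the original's code), assuming prices[k] holds the price Vk of a piece of length k; the prices V1..V4 = 1, 5, 8, 9 are taken from the worked example:

```python
def cut_rod(prices, n):
    # c[i] = price of the optimal cut of a rod of length i
    c = [0] * (n + 1)
    for length in range(1, n + 1):
        # try every first-piece length k and keep the most profitable
        c[length] = max(prices[k] + c[length - k]
                        for k in range(1, length + 1))
    return c[n]

print(cut_rod([0, 1, 5, 8, 9], 4))  # → 10, matching C(4) above
```

Each of the n table entries tries at most n cuts, giving the O(n^2) running time.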
3.3. Matrix chain multiplication
Input: n matrices A1, A2, ..., An of dimensions P1 × P2, P2 × P3, ..., Pn × Pn+1
Goal: to compute the matrix product A1 A2 ... An
Problem: In what order should A1, A2, ..., An be multiplied so that it would take the
minimum number of computations to derive the product?
For matrix multiplication the cost is:
Let A and B be two matrices of dimensions p × q and q × r.
Then C = AB is of dimensions p × r.
Each entry Cij takes q scalar multiplications and (q-1) scalar additions, so
computing C takes p·q·r scalar multiplications in total.
Consider an example of the best way of multiplying 3 matrices:
Let A1 of dimensions 5 × 4, A2 of dimensions 4 × 6 and A3 of dimensions
6 × 2
To solve this problem using the dynamic programming method we will perform the
following steps.
Let Mij denote the cost of multiplying Ai...Aj, where the cost is measured in the
number of scalar multiplications.
Here, M(i,i) = 0 for all i, and M(1,n) is the required solution.
The sequence of decisions can be built using the principle of optimality.
Consider the process of matrix chain multiplication.
Let T be the tree corresponding to the optimal way of multiplying Ai...Aj.
T has a left subtree L and a right subtree R; L corresponds to multiplying Ai...Ak and
R to multiplying Ak+1...Aj, for some integer k such that i ≤ k ≤ j-1.
Thus we get optimal sub-chains of matrices, and then the multiplication is performed.
This ultimately shows that matrix chain multiplication follows the principle of
optimality.
We will apply the following formula for computing each sequence:
Mij = min { Mik + Mk+1,j + Pi Pk+1 Pj+1 | i ≤ k ≤ j-1 }
Example : consider A1 = 5 × 4, A2 = 4 × 6, A3 = 6 × 2, A4 = 2 × 7
P1 = 5, P2 = 4, P3 = 6, P4 = 2, P5 = 7
The computations are given as below
Now let i = 1, j = 2, k = 1. We will compute Mij using
Mij = min { Mik + Mk+1,j + Pi Pk+1 Pj+1 | i ≤ k ≤ j-1 }
M12 = M11 + M22 + P1 P2 P3 = 0 + 0 + 5 * 4 * 6 = 120
Now i=2, j = 3, k = 2.
M23 = M22 + M33 + P2 P3 P4 = 0 + 0 + 4 * 6 * 2 = 48
Now i = 3, j = 4, k = 3
M34 = M33 + M44 + P3 P4 P5 = 0 + 0 + 6 * 2 * 7 = 84
Now i = 1, j = 3, and k = 1 or k = 2. We will consider k = 1 first; then
M13 = M11 + M23 + P1 P2 P4 = 0 + 48 + 5 * 4 * 2 = 88
When k = 2 we get M13 as
M13 = M12 + M33 + P1 P3 P4 = 120 + 0 + 60 = 180
Since we get the minimum value of M13 when k = 1, we set k = 1 and M13 = 88.
Now for i = 2, j = 4, k = 3
M24 = M23 + M44 + P2 P4 P5 = 48 + 0 + 4 * 2 * 7 = 104
Now i = 1, j = 4, k = 3
M14 = M13 + M44 + P1 P4 P5 = 88+ 0 + 5 * 2 *7 = 158
Thus we get the optimum cost as 158. To find the optimum sequence for this cost we
trace back the choices of k recorded above. For this optimal cost of 158,
the optimal sequence is (A1 × (A2 × A3)) × A4.
For instance:
A1 = 5 × 4, A2 = 4 × 6, A3 = 6 × 2, A4 = 2 × 7
Hence
(A1 × (A2 × A3)) × A4
= (5 × 4 × 2) + (4 × 6 × 2) + (5 × 2 × 7)
= 40 + 48 + 70
= 88 + 70
= 158
Hence the optimal parenthesization is (A1(A2A3))A4.
Algorithm:
The Algorithm (with p[0..n] holding the dimensions, so that Ai is p[i-1] × p[i]):
n ← length[p] − 1
for i ← 1 to n do
  m[i, i] ← 0
end for
for ℓ ← 2 to n do
  for i ← 1 to n − ℓ + 1 do
    j ← i + ℓ − 1
    m[i, j] ← ∞
    for k ← i to j − 1 do
      q ← m[i, k] + m[k+1, j] + p[i−1] · p[k] · p[j]
      if q < m[i, j] then
        m[i, j] ← q
        s[i, j] ← k
      end if
    end for
  end for
end for
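A Python sketch of the tabular algorithm above (an illustration, not the original's code), using a 0-indexed dimension list p where Ai has dimensions p[i-1] × p[i]:

```python
import math

def matrix_chain_order(p):
    # p is 0-indexed: matrix Ai has dimensions p[i-1] x p[i], i = 1..n
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][j] = min cost of Ai..Aj
    s = [[0] * (n + 1) for _ in range(n + 1)]  # s[i][j] = best split point k
    for length in range(2, n + 1):             # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):              # try every split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s

m, s = matrix_chain_order([5, 4, 6, 2, 7])
print(m[1][4])  # → 158, the optimum cost from the worked example
```

Here s[1][4] = 3, i.e. the outermost split is (A1 A2 A3)(A4), and within it s[1][3] = 1 gives A1(A2 A3), reproducing the parenthesization (A1(A2A3))A4.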
3.4. Elements of Dynamic Programming
We have done an example of dynamic programming, the matrix chain
multiplication problem, but what can be said, in general, to guide us in choosing DP?
Optimal Substructure
Overlapping Sub-problems
Variant: Memoization
Optimal Substructure: OS holds if an optimal solution contains within it optimal
solutions to its sub-problems. In matrix-chain multiplication, optimally computing A1, A2,
A3, ..., An required A1...k and Ak+1...n to be computed optimally. It is often easy to show the
optimal sub-problem property as follows:
Split the problem into sub-problems.
The sub-problems must be optimal; otherwise the optimal splitting would not have
been optimal.
There is usually a suitable "space" of sub-problems. Some spaces are more
"natural" than others.
A general way to investigate optimal substructure of a problem in DP is to
look at optimal sub-, sub-sub-, etc. problems for structure. When we noticed that the sub-problems
of A1, A2, A3, ..., An consisted of sub-chains, it made sense to use sub-chains
of the form Ai, ..., Aj as the "natural" space of sub-problems.
Overlapping Sub-problems: Space of sub-problems must be small: recursive
solution re-solves the same sub-problem many times. Usually there are
polynomially many sub-problems, and we revisit the same ones over and over
again: overlapping sub-problems.
Memoization: What if we stored sub-problems and used the stored solutions in a
recursive algorithm? This is like divide-and-conquer, top down, but should benefit
like DP which is bottom-up. Memoized version maintains an entry in a table. One
can use a fixed table or a hash table.
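As a sketch of memoization (illustrative code, reusing the rod-cutting problem of Section 3.2 rather than an example from this section), a hash table stores each sub-problem's solution so the top-down recursion computes it only once:

```python
def cut_rod_memo(prices, n, memo=None):
    # top-down: recurse as in divide and conquer, but maintain an entry
    # per sub-problem in a hash table so each is solved only once
    if memo is None:
        memo = {}
    if n == 0:
        return 0
    if n not in memo:
        memo[n] = max(prices[k] + cut_rod_memo(prices, n - k, memo)
                      for k in range(1, n + 1))
    return memo[n]

print(cut_rod_memo([0, 1, 5, 8, 9], 4))  # → 10, same as the bottom-up version
```

This is "like divide-and-conquer, top down", but benefits like bottom-up DP because overlapping sub-problems are never re-solved.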
3.5. Longest Common Subsequence (LCS)
Application: comparison of two DNA strings
Ex: X= {A B C B D A B }, Y= {B D C A B A}
Longest Common Subsequence:
X = A B C B D A B
Y = B D C A B A
A brute-force algorithm would compare each subsequence of X with the symbols in Y.
LCS Algorithm
If |X| = m and |Y| = n, then there are 2^m subsequences of X; we must compare
each with Y (n comparisons each).
So the running time of the brute-force algorithm is O(n · 2^m).
Notice that the LCS problem has optimal substructure: solutions of
subproblems are parts of the final solution.
Subproblems: “find LCS of pairs of prefixes of X and Y”
First we’ll find the length of LCS. Later we’ll modify the algorithm to find
LCS itself.
Define Xi and Yj to be the prefixes of X and Y of length i and j respectively.
Define c[i,j] to be the length of the LCS of Xi and Yj.
Then the length of the LCS of X and Y will be c[m,n].
LCS recursive solution
We start with i = j = 0 (empty prefixes of X and Y).
Since X0 and Y0 are empty strings, their LCS is always empty (i.e. c[0,0] =
0).
The LCS of an empty string and any other string is empty, so for every i and j: c[0,
j] = c[i,0] = 0.
When we calculate c[i,j], we consider two cases:
First case: x[i] = y[j]. One more symbol in strings X and Y matches, so the
length of the LCS of Xi and Yj equals the length of the LCS of the smaller
strings Xi-1 and Yj-1, plus 1.
Second case: x[i] != y[j]. As the symbols don't match, our solution is not
improved, and the length of LCS(Xi, Yj) is the same as before (i.e. the
maximum of LCS(Xi, Yj-1) and LCS(Xi-1, Yj)).
Why not just take the length of LCS(Xi-1, Yj-1)? Because c[i-1,j-1] is never
larger than c[i-1,j] or c[i,j-1], so taking the maximum of those two already
covers it.
11. LCS Length Algorithm
LCS-Length(X, Y)
1. m = length(X) // get the # of symbols in X
2. n = length(Y) // get the # of symbols in Y
3. for i = 1 to m c[i,0] = 0 // special case: Y0
4. for j = 1 to n c[0,j] = 0 // special case: X0
5. for i = 1 to m // for all Xi
6. for j = 1 to n // for all Yj
7. if (X[i] == Y[j])
8. c[i,j] = c[i-1,j-1] + 1
9. else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c
LCS Example
We’ll see how the LCS algorithm works on the following example:
■ X = ABCB
■ Y = BDCAB
X = A B C B
Y = B D C A B
What is the Longest Common Subsequence of X and Y?
LCS(X, Y) = BCB
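The LCS-Length algorithm above, extended to also recover the subsequence itself by backtracking through the table, can be sketched in Python (an illustration consistent with the pseudocode, not the original's code):

```python
def lcs(X, Y):
    m, n = len(X), len(Y)
    # c[i][j] = length of LCS of the prefixes X[:i] and Y[:j]
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # backtrack from c[m][n] to recover one LCS
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return ''.join(reversed(out))

print(lcs("ABCB", "BDCAB"))  # → BCB, as in the example above
```

The table takes O(mn) time and space, a large improvement over the O(n · 2^m) brute force.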
3.6. Optimal Binary Search Trees
A binary search tree T is a binary tree; either it is empty or each node in the tree
contains an identifier and:
1. All identifiers in the left subtree of T are less than (numerically or
alphabetically) the identifier in the root node of T.
2. All identifiers in the right subtree are greater than the identifier in the root
node of T.
3. The left and right subtrees of T are also binary search trees.
To search for an element in a binary search tree, the element is first compared
with the root node. If the element is less than the root node, the search continues in the left subtree.
If the element is greater than the root node, the search continues in the right subtree. If the element
is equal to the root node, the search is successful (element found) and the search
procedure terminates.
The principle of OBST is
c(i, j) = min { c(i, k-1) + c(k, j) } + w(i, j),   i < k ≤ j
Problem
Using the OBST algorithm, compute w(i,j), r(i,j) and c(i,j), 0 ≤ i ≤ j ≤ 4, for the identifier
set (a1, a2, a3, a4) = (end, goto, print, stop) with p(1) = 3, p(2) = 3, p(3) = 1, p(4)
= 1, q(0) = 2, q(1) = 3, q(2) = 1, q(3) = 1, q(4) = 1. Using r(i,j), construct the
optimal binary search tree.
Solution
Initially, c(i,i) = 0, r(i,i) = 0, 0≤i≤4.
w(i,i) = q(i),
w(0, 0) = q(0) = 2,
w(1, 1) = q(1) = 3,
w(2, 2) = q(2) = 1,
w(3, 3) = q(3) = 1,
w(4, 4) = q(4) = 1.
Now, c(i, j) = min { c(i, k-1) + c(k, j) } + w(i, j),   i < k ≤ j
w(i, j) = p(j) + q(j) + w(i, j-1)
r(i, j) = k, i < k ≤ j (k is chosen such that the cost is minimum)
(The intermediate computations of w, c and r for smaller spans are omitted here; they give w(0, 3) = 14.) Finally,
w(0, 4) = p(4) + q(4) + w(0, 3) = 1 + 1 + 14 = 16
c(0, 4) = min { c(0,0) + c(1,4), c(0,1) + c(2,4), c(0,2) + c(3,4), c(0,3) + c(4,4) } + w(0, 4),   0 < k ≤ 4
= min {19, 16, 22, 25} + 16
= 16 + 16
= 32
r(0, 4) = 2
Table Computation c(0,4), w(0,4) and r(0,4)
To build the OBST, r(0, 4) = 2 ⟹ k = 2.
Hence a2 becomes the root node.
Let T be the OBST; Ti,j is divided into Ti,k-1 and Tk,j.
So T0,4 is divided into two parts, T0,1 and T2,4.
20. Fig. T0, 4 is divided into Two Parts T0, 1 and T2, 4
T0,1: r(0, 1) = 1 ⟹ k = 1
T2,4: r(2, 4) = 3 ⟹ k = 3
T0,1 is divided into two parts, T0,0 and T1,1 (∵ k = 1), and T2,4 is divided into T2,2 and T3,4 (∵ k = 3).
Again, T3,4 is divided into T3,3 and T4,4 (∵ r(3,4) = 4 ⟹ k = 4).
Since r(0,0), r(1,1), r(2,2), r(3,3), r(4,4) are all 0, these are external nodes and can be neglected.
Fig. Optimal Binary Search Tree
The cost of optimal binary search tree is 32 and the root is a2.
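The w, c, r computation can be sketched in Python (an illustrative implementation of the recurrences above, not the original's algorithm), with p and q taken from the problem statement:

```python
import math

def obst(p, q):
    # p[1..n]: success weights of the identifiers, q[0..n]: failure weights
    n = len(q) - 1
    w = [[0] * (n + 1) for _ in range(n + 1)]
    c = [[0] * (n + 1) for _ in range(n + 1)]
    r = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]          # w(i,i) = q(i); c(i,i) = r(i,i) = 0
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            w[i][j] = p[j] + q[j] + w[i][j - 1]
            best = math.inf
            for k in range(i + 1, j + 1):   # choose the root k minimizing cost
                cost = c[i][k - 1] + c[k][j]
                if cost < best:
                    best = cost
                    r[i][j] = k
            c[i][j] = best + w[i][j]
    return w, c, r

w, c, r = obst([0, 3, 3, 1, 1], [2, 3, 1, 1, 1])
print(c[0][4], r[0][4])  # → 32 2, matching the solution above
```

The r table then drives the tree construction: r(0,4) = 2 makes a2 the root, after which the sub-ranges are split recursively as shown above.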