Welcome to the comprehensive study material designed specifically for students of Rajiv Gandhi Proudyogiki Vishwavidyalaya (RGPV) in their 4th semester of the Computer Science Engineering program. In this study material, we will explore the essential concepts of "Analysis & Design of Algorithms," a crucial subject that forms the foundation of efficient problem-solving in computer science and engineering.
Unit 1: Greedy Strategy and Examples
This unit introduces the Greedy strategy, a powerful approach in algorithm design. Greedy algorithms make locally optimal choices at each step, aiming to achieve a globally optimal solution. We will explore its applications through various examples:
Optimal Merge Patterns: Learn how to merge sorted sequences efficiently using the Greedy approach, minimizing the number of comparisons needed.
Huffman Coding: Dive into the world of data compression with Huffman coding, assigning shorter binary codes to frequently occurring characters.
Minimum Spanning Trees (MST): Understand Prim's and Kruskal's algorithms to find the minimum weight spanning tree in a graph, crucial in network design and clustering.
Knapsack Problem: Discover the limitations of the Greedy strategy on the 0/1 Knapsack problem, where dynamic programming is needed for an optimal solution, and see why the fractional variant does yield to the greedy approach.
Job Sequencing with Deadlines: Apply the Greedy approach to schedule jobs optimally based on deadlines and profits.
Single Source Shortest Path Algorithm: Master Dijkstra's algorithm to find the shortest paths from a single source vertex to all other vertices in a weighted graph.
Each topic is accompanied by detailed explanations, examples, and step-by-step algorithms to enhance your understanding and problem-solving skills.
About the Author
The study material is meticulously prepared by expert educators with vast experience in computer science and engineering. They have a deep understanding of the RGPV curriculum and the specific needs of 4th-semester students pursuing Computer Science Engineering.
How to Use this Study Material
This study material is designed to complement your regular coursework. Use it as a valuable resource for exam preparation, revision, and understanding complex concepts. Pay attention to algorithmic explanations, proof of correctness, and time complexity analysis.
Why This Study Material?
RGPV-Aligned Content: The material is tailored to align with the RGPV curriculum, ensuring relevance and applicability.
Comprehensive Coverage: All critical topics are covered in detail, providing a holistic understanding of the subject.
Clarity and Simplicity: Complex algorithms and concepts are explained in a clear and straightforward manner, making it accessible to all students.
Real-World Applications: Explore how algorithms are applied to solve real-world problems, enhancing your problem-solving skills.
Practice Questions: Practice questions and exercises are included to reinforce your learning.
RGPV दे Bunkers
ADA Unit — 2: Greedy Strategy and Examples
1. Introduction to Greedy Strategy
The Greedy strategy is a powerful approach used in algorithm design to solve various
optimization problems. It belongs to the class of algorithms known as "constructive heuristics,"
where decisions are made at each step to optimize a certain objective function. The greedy
strategy builds solutions piece by piece by always making locally optimal choices, hoping that
these choices will lead to the globally optimal solution. In simpler terms, at each step, the
greedy algorithm selects the best available option without considering the consequences of that
decision on future steps.
The greedy strategy is particularly useful when the problem exhibits the "Greedy Choice
Property," which means that a globally optimal solution can be reached by making locally
optimal choices. However, it's crucial to understand that not all optimization problems can be
efficiently solved using the greedy approach, as it doesn't guarantee a globally optimal solution
for every problem.
2. Optimal Merge Patterns
2.1 Definition
The "Optimal Merge Patterns" problem is a classic example of applying the greedy strategy to
efficiently merge multiple sorted sequences. Given 'n' sorted sequences, each containing a
certain number of records, the goal is to merge these sequences into a single sorted sequence
with the minimum number of comparisons.
2.2 Greedy Algorithm
To solve the Optimal Merge Patterns problem, we can use a priority queue (min-heap) to
efficiently merge the sorted sequences. The steps for the greedy algorithm are as follows:
1. Create a min-heap and insert the lengths of all 'n' sequences into it. The root of the heap
will always hold the length of the shortest remaining sequence.
2. While the heap contains more than one entry:
● Extract the two shortest sequences from the heap.
● Merge them into a new sorted sequence using a merging algorithm like the
"Merge Sort" merge step. This costs a number of comparisons proportional to the sum of
the two lengths, so that sum is added to the running total cost.
● Insert the length of the merged sequence back into the heap.
3. The last remaining entry in the heap corresponds to the final output, the merged and
sorted sequence of all records, and the running total is the minimum overall merge cost.
2.3 Example
Let's consider three sorted sequences as an example:
1. Sequence 1 (length 4): [2, 4, 6, 8]
2. Sequence 2 (length 4): [1, 3, 5, 7]
3. Sequence 3 (length 4): [0, 9, 10, 11]
Using the greedy algorithm, we repeatedly merge the two shortest sequences:
Initial heap of lengths: [4, 4, 4]
1. Extract the two shortest sequences (lengths 4 and 4), say Sequence 1 and Sequence 2, and
merge them into [1, 2, 3, 4, 5, 6, 7, 8] at a cost of 4 + 4 = 8.
Updated heap: [4, 8]
2. Extract the remaining two sequences (lengths 4 and 8) and merge them into the final
sequence at a cost of 4 + 8 = 12.
Final Merged Sequence: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
Total merge cost: 8 + 12 = 20.
With equal lengths any merge order costs the same here, but when the lengths differ, always
merging the two shortest sequences minimizes the total cost. Thus, using the greedy strategy,
we have merged the three sorted sequences into a single sorted sequence with the minimum
total number of comparisons.
3. Huffman Coding
3.1 Definition
Huffman coding is a lossless data compression algorithm used to compress data efficiently. It is
based on the concept of variable-length prefix codes, where different characters are
represented by codes of different lengths. The most frequently occurring characters are
assigned shorter codes, while less frequent characters are assigned longer codes.
3.2 Greedy Algorithm
The steps to construct a Huffman tree and encode the data using the Huffman coding algorithm
are as follows:
1. Calculate the frequency of each character in the input data.
2. Create a min-heap (priority queue) of nodes, where each node represents a character and its
frequency. Initially, each character is considered as a single-node binary tree.
3. While there is more than one node in the heap:
● Extract the two nodes with the lowest frequencies from the heap. These nodes will
become the left and right children of a new internal node.
● Create a new internal node with a frequency equal to the sum of the frequencies of its
children.
● Insert the new internal node back into the heap.
4. The root of the heap represents the root of the Huffman tree.
5. Traverse the Huffman tree from the root to each leaf node while assigning '0' for left edges
and '1' for right edges.
6. The binary codes obtained during the tree traversal represent the Huffman encoding of each
character.
7. Replace each character in the input data with its corresponding Huffman code to compress
the data.
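The construction can be sketched compactly with heapq and collections.Counter. The integer tiebreaker keeps tuple comparisons well-defined when frequencies are equal; note that tie-breaking choices may yield different (but equally optimal) codes than a hand-worked example:

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Return a dict mapping each character of `data` to its Huffman code."""
    freq = Counter(data)
    # Heap entries: (frequency, unique tiebreaker, tree), where a tree is
    # either a single character or a (left, right) pair of subtrees.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")     # left edge = '0'
            walk(node[1], prefix + "1")     # right edge = '1'
        else:
            codes[node] = prefix or "0"     # single-character edge case
    walk(heap[0][2], "")
    return codes

codes = huffman_codes("ABBCCCDDDDEEEEE")
encoded = "".join(codes[ch] for ch in "ABBCCCDDDDEEEEE")
print(len(encoded))  # 33 bits, versus 15 * 8 = 120 bits of fixed-width ASCII
```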
3.3 Example
Let's consider a simple example to illustrate Huffman coding:
Input Data: "ABBCCCDDDDEEEEE"
Step 1: Calculate character frequencies:
● A: 1 occurrence
● B: 2 occurrences
● C: 3 occurrences
● D: 4 occurrences
● E: 5 occurrences
Step 2: Create the initial min-heap of nodes:
Nodes: (A:1), (B:2), (C:3), (D:4), (E:5)
Step 3: Construct the Huffman tree using the greedy algorithm:
1. Extract (A:1) and (B:2), the two lowest frequencies, and create an internal node with
frequency 3:
Nodes: (C:3), (Internal:3), (D:4), (E:5)
2. Extract (C:3) and (Internal:3), now the two lowest frequencies, and create an internal node
with frequency 6:
Nodes: (D:4), (E:5), (Internal:6)
3. Extract (D:4) and (E:5) and create an internal node with frequency 9:
Nodes: (Internal:6), (Internal:9)
4. Extract (Internal:6) and (Internal:9) and create the root node with frequency 15:
Nodes: (Root:15)
Step 4: The resulting Huffman tree: the root (15) has Internal(6) as its left child and
Internal(9) as its right child; Internal(6) has C(3) on the left and Internal(3) on the right,
which in turn has A(1) on the left and B(2) on the right; Internal(9) has D(4) on the left and
E(5) on the right.
Step 5: Traverse the Huffman tree and assign binary codes to each character ('0' for left
edges, '1' for right edges):
● A: 010
● B: 011
● C: 00
● D: 10
● E: 11
Step 6: Replace each character in the input data with its corresponding Huffman code:
Input Data: "ABBCCCDDDDEEEEE"
Huffman Encoded Data: "010011011000000101010101111111111" (33 bits, versus 15 × 8 = 120 bits in
a fixed-width 8-bit encoding)
The original data is compressed using Huffman coding, resulting in a shorter binary
representation. The compression is achieved because the frequently occurring characters are
assigned shorter codes, while less frequent characters are assigned longer codes.
4. Minimum Spanning Trees (MST)
4.1 Definition
The Minimum Spanning Tree (MST) problem is a classic optimization problem in graph theory.
Given a connected, undirected graph with edge weights, the goal is to find the tree that spans
all vertices with the minimum possible total edge weight. In other words, an MST is a subgraph
that connects all vertices without forming any cycles and has the minimum sum of edge weights
among all possible spanning trees.
4.2 Greedy Algorithms for MST
Two popular greedy algorithms to find the MST of a graph are Prim's algorithm and Kruskal's
algorithm.
4.2.1 Prim's Algorithm
Prim's algorithm starts with an arbitrary vertex and repeatedly adds the minimum-weight edge
that connects a vertex in the current MST to a vertex outside the MST until all vertices are
included.
The steps of Prim's algorithm are as follows:
1. Initialize an empty MST and a set to keep track of vertices included in the MST.
2. Choose an arbitrary vertex as the starting point and add it to the MST set.
3. While the MST set does not include all vertices:
a. Find the minimum-weight edge that connects a vertex in the MST set to a vertex
outside the MST set.
b. Add the vertex at the other end of the selected edge to the MST set and add the
edge to the MST.
4. The MST is complete when all vertices are included.
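The steps above can be sketched in Python with a min-heap of candidate edges. The adjacency-list format and function name are illustrative; the sample graph uses the edge weights from the example in Section 4.3:

```python
import heapq

def prims_mst(graph, start):
    """graph: dict vertex -> list of (neighbor, weight). Returns (edges, total)."""
    visited = {start}
    # Candidate edges leaving the current tree, ordered by weight
    edges = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(edges)
    mst, total = [], 0
    while edges and len(visited) < len(graph):
        w, u, v = heapq.heappop(edges)
        if v in visited:
            continue              # both endpoints already in the tree: skip
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for nxt, nw in graph[v]:  # new candidate edges leaving the tree
            if nxt not in visited:
                heapq.heappush(edges, (nw, v, nxt))
    return mst, total

graph = {
    'A': [('B', 5), ('C', 3)],
    'B': [('A', 5), ('D', 2)],
    'C': [('A', 3), ('D', 4)],
    'D': [('C', 4), ('B', 2), ('E', 6)],
    'E': [('D', 6)],
}
print(prims_mst(graph, 'A')[1])  # total MST weight: 15
```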
4.2.2 Kruskal's Algorithm
Kruskal's algorithm starts with each vertex forming a separate component and repeatedly adds
the minimum-weight edge that doesn't form a cycle with the edges already included in the MST
until all vertices are connected.
The steps of Kruskal's algorithm are as follows:
1. Create a forest of single-vertex trees, where each vertex is a separate component.
2. Sort all edges in non-decreasing order of their weights.
3. Iterate through the sorted edges and add each edge to the MST if it doesn't form a cycle
with the edges already included in the MST.
4. The MST is complete when all vertices are connected.
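Kruskal's algorithm pairs naturally with a Union-Find (disjoint-set) structure for the cycle check in step 3. A minimal sketch, with illustrative names and the same sample edges as the example in Section 4.3:

```python
def kruskals_mst(vertices, edges):
    """edges: list of (weight, u, v) tuples. Returns (mst_edges, total weight)."""
    parent = {v: v for v in vertices}

    def find(x):
        # Root lookup with path halving for near-constant amortized time
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):       # non-decreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                    # different components: no cycle formed
            parent[ru] = rv             # union the two components
            mst.append((u, v, w))
            total += w
    return mst, total

edges = [(5, 'A', 'B'), (3, 'A', 'C'), (4, 'C', 'D'), (2, 'B', 'D'), (6, 'D', 'E')]
print(kruskals_mst('ABCDE', edges)[1])  # total MST weight: 15
```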
4.3 Example
Let's consider an undirected graph on vertices A through E with the following edge weights:
● (A, B): 5
● (A, C): 3
● (C, D): 4
● (B, D): 2
● (D, E): 6
Using Prim's algorithm to find the MST, we start with vertex A:
Step 1: A (starting point)
Step 2: Add A - C (weight 3), the cheapest edge leaving {A}
Step 3: Add C - D (weight 4), the cheapest edge leaving {A, C}
Step 4: Add B - D (weight 2), the cheapest edge leaving {A, C, D}
Step 5: Add D - E (weight 6), the cheapest edge leaving {A, B, C, D}
The MST is complete, and the total weight is 3 + 4 + 2 + 6 = 15.
Using Kruskal's algorithm to find the MST, we sort the edges in non-decreasing order:
(B, D, 2), (A, C, 3), (C, D, 4), (A, B, 5), (D, E, 6)
Step 1: Add (B, D, 2)
Step 2: Add (A, C, 3)
Step 3: Add (C, D, 4)
Step 4: Skip (A, B, 5), since A and B are already connected via C and D, and adding it would
form a cycle
Step 5: Add (D, E, 6)
The MST is complete, and the total weight is 2 + 3 + 4 + 6 = 15.
Prim's and Kruskal's algorithms may select edges in a different order, but every minimum
spanning tree of a graph has the same total weight, so both algorithms always agree on the
minimum total (here, 15), even when several distinct minimum spanning trees exist.
5. Knapsack Problem
5.1 Definition
The Knapsack problem is a classic optimization problem that deals with a knapsack with a fixed
capacity and a set of items, each having a weight and a value. The goal is to determine the
most valuable combination of items that can fit into the knapsack without exceeding its capacity.
5.2 Greedy Approach
The Knapsack problem, in its classical 0/1 form where each item must be taken whole or left
behind, cannot be solved optimally by a greedy strategy. Selecting items solely by their
value-to-weight ratio may not lead to the best overall value.
To understand why the greedy approach fails, consider a knapsack with a capacity of 10 units
and, for example, the following items:
● Item 1: Weight 6, Value 13 (value-to-weight ratio ≈ 2.17)
● Item 2: Weight 5, Value 12 (value-to-weight ratio = 2.4)
● Item 3: Weight 7, Value 13 (value-to-weight ratio ≈ 1.86)
If we apply the greedy approach and select items based on their value-to-weight ratio, the
greedy algorithm picks Item 2 first, since it has the highest ratio (2.4). However, with Item 2
in the knapsack only 5 units of capacity remain, so neither Item 1 nor Item 3 fits, even though
each of them alone is worth more than Item 2. The greedy total is 12, while the optimal
solution is Item 1 alone with value 13. In this case, the greedy approach fails to find the
optimal solution.
The 0/1 Knapsack problem is NP-hard, meaning no polynomial-time algorithm is known that finds
the exact optimal solution for arbitrary instances. It is typically solved with dynamic
programming, which runs in pseudo-polynomial time O(nW) for n items and capacity W. In
contrast, the Fractional Knapsack problem, where items may be split, is solved optimally by
exactly this greedy value-to-weight rule.
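The greedy ratio rule does solve the fractional variant exactly: take items in decreasing ratio order, splitting the last one to fill the remaining capacity. A minimal sketch, with illustrative item values and weights:

```python
def fractional_knapsack(capacity, items):
    """items: list of (value, weight) pairs. Items may be taken fractionally."""
    total = 0.0
    # Highest value-to-weight ratio first
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        take = min(weight, capacity)        # take as much of this item as fits
        total += value * (take / weight)
        capacity -= take
        if capacity == 0:
            break
    return total

# Capacity 10: all of a (12, 5) item, then five-sixths of a (13, 6) item
print(fractional_knapsack(10, [(13, 6), (12, 5), (13, 7)]))  # ≈ 22.83
```

The same ratio-first rule that fails for the 0/1 problem is provably optimal here, because splitting the last item lets the greedy choice use every unit of capacity at the best available rate.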
6. Job Sequencing with Deadlines
6.1 Definition
The "Job Sequencing with Deadlines" problem is another classic optimization problem that
deals with a set of 'n' jobs, each with a deadline and a profit. The objective is to schedule the
jobs in a way that maximizes the total profit while meeting the given deadlines. Each job takes a
single unit of time to complete, and only one job can be scheduled at a time.
6.2 Greedy Algorithm
The greedy algorithm for the Job Sequencing with Deadlines problem involves the following
steps:
1. Sort the jobs in non-increasing order of their profits.
2. Initialize an array called 'slots' to keep track of the allocated time slots. Initially, all
elements in the 'slots' array are set to -1 to indicate that no job is scheduled.
3. For each job, starting from the job with the highest profit:
a. Find the latest available time slot at or before its deadline. This can be done by
scanning the 'slots' array from the deadline down to the first time slot (slot 1).
b. If a free slot is found, assign the job to that slot and update the 'slots' array
accordingly.
c. If no slot is free, skip the job and move on to the next one.
4. The 'slots' array now contains the optimal job schedule, and the total profit obtained can
be calculated.
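The steps above can be sketched in Python. The slot-scanning loop below is the straightforward O(n²) version (a disjoint-set structure can speed up the free-slot lookup); the job tuples are illustrative:

```python
def job_sequencing(jobs):
    """jobs: list of (job_id, deadline, profit). Each job takes one unit of time."""
    max_deadline = max(d for _, d, _ in jobs)
    slots = [None] * (max_deadline + 1)   # slots[1..max_deadline]; index 0 unused
    total = 0
    # Consider jobs in non-increasing order of profit
    for job_id, deadline, profit in sorted(jobs, key=lambda j: -j[2]):
        # Latest free slot at or before the deadline
        for t in range(deadline, 0, -1):
            if slots[t] is None:
                slots[t] = job_id
                total += profit
                break                     # job placed; otherwise it is skipped
    return slots[1:], total

jobs = [(1, 2, 60), (2, 1, 100), (3, 2, 20), (4, 1, 40)]
print(job_sequencing(jobs))  # ([2, 1], 160): Job 2 in slot 1, Job 1 in slot 2
```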
6.3 Example
Let's consider the following set of jobs with their respective deadlines and profits:
● Job 1: Deadline = 2, Profit = 60
● Job 2: Deadline = 1, Profit = 100
● Job 3: Deadline = 2, Profit = 20
● Job 4: Deadline = 1, Profit = 40
Using the greedy algorithm, we sort the jobs in non-increasing order of profits: Job 2 (100),
Job 1 (60), Job 4 (40), Job 3 (20).
Now, we proceed with the job scheduling:
● Job 2 (Deadline = 1, Profit = 100) is scheduled at time slot 1.
● Job 1 (Deadline = 2, Profit = 60) is scheduled at time slot 2.
● Job 4 (Deadline = 1, Profit = 40) cannot be scheduled, since time slot 1 is already
occupied.
● Job 3 (Deadline = 2, Profit = 20) cannot be scheduled either, since both slots at or
before its deadline (slots 1 and 2) are occupied.
The final job schedule is as follows:
● Time Slot 1: Job 2 (Profit = 100)
● Time Slot 2: Job 1 (Profit = 60)
Total Profit: 100 + 60 = 160
The greedy algorithm successfully found the optimal job schedule with the maximum total profit
while meeting all the deadlines.
7. Single Source Shortest Path Algorithm
7.1 Definition
The Single Source Shortest Path (SSSP) problem aims to find the shortest paths from a single
source vertex to all other vertices in a weighted graph. The "shortest path" is defined as the path
with the minimum sum of edge weights between the source vertex and each destination vertex.
7.2 Greedy Algorithm - Dijkstra's Algorithm
Dijkstra's algorithm is a widely used greedy algorithm to solve the SSSP problem for graphs with
non-negative edge weights. The algorithm works as follows:
1. Initialize a distance array and set the distance of the source vertex to 0 and all other
vertices to infinity. The distance array will be used to keep track of the minimum distance
from the source vertex to each vertex.
2. Create a priority queue (min-heap) to keep track of the next vertex to explore. Initially,
the source vertex is inserted into the priority queue.
3. While the priority queue is not empty:
a. Extract the vertex with the minimum distance from the queue. This vertex is the
one with the shortest path discovered so far.
b. Relax all edges adjacent to the extracted vertex. Relaxation means updating the
distance of an adjacent vertex whenever a shorter path to it is found through the
current vertex, and inserting the updated vertex into the priority queue.
4. After the algorithm completes, the distance array contains the shortest distances from the
source vertex to all other vertices in the graph.
Dijkstra's algorithm efficiently finds the shortest paths in a graph with non-negative edge
weights, making it suitable for various real-world applications, such as routing algorithms and
GPS navigation systems.
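Dijkstra's algorithm maps directly onto Python's heapq. The sketch below uses the common "lazy deletion" variant, where stale queue entries are skipped rather than decreased in place; the graph and names are illustrative:

```python
import heapq

def dijkstra(graph, source):
    """graph: dict vertex -> list of (neighbor, weight); weights must be non-negative."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq = [(0, source)]                    # (distance, vertex) min-heap
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                      # stale entry: a shorter path was found already
        for v, w in graph[u]:
            if d + w < dist[v]:           # relaxation step
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

graph = {
    'A': [('B', 4), ('C', 2)],
    'B': [('D', 5)],
    'C': [('D', 3)],
    'D': [],
}
print(dijkstra(graph, 'A'))  # {'A': 0, 'B': 4, 'C': 2, 'D': 5}
```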
7.3 Example
Let's consider a weighted graph with the following edges:
● (A, B): weight 4
● (A, C): weight 2
● (C, D): weight 3
● (B, D): weight 5
To find the shortest paths from vertex A to all other vertices using Dijkstra's algorithm, we
proceed with the following steps:
1. Initialize the distance array: Distance[A] = 0, Distance[B] = ∞, Distance[C] = ∞,
Distance[D] = ∞.
2. Start with vertex A and insert it into the priority queue.
3. Extract vertex A (Distance[A] = 0):
a. Relax edge (A, B) with weight 4: Distance[B] = min(Distance[B], Distance[A] + 4)
= min(∞, 0 + 4) = 4.
b. Relax edge (A, C) with weight 2: Distance[C] = min(Distance[C], Distance[A] + 2)
= min(∞, 0 + 2) = 2.
4. Extract vertex C (Distance[C] = 2):
a. Relax edge (C, D) with weight 3: Distance[D] = min(Distance[D], Distance[C] + 3)
= min(∞, 2 + 3) = 5.
5. Extract vertex B (Distance[B] = 4):
a. Relax edge (B, D) with weight 5: Distance[D] = min(Distance[D], Distance[B] + 5)
= min(5, 4 + 5) = 5.
6. Extract vertex D (Distance[D] = 5):
a. No adjacent edges to relax, so the process for vertex D is complete.
The final distance array after completing Dijkstra's algorithm gives the shortest distances
from vertex A to all other vertices:
● A to A: 0 (source vertex itself)
● A to B: 4
● A to C: 2
● A to D: 5
The shortest path from A to each vertex in the graph is determined by the minimum distance
value obtained by Dijkstra's algorithm.
Conclusion
In this unit, we have explored the "Greedy Strategy" and its applications in various algorithms.
We started with the "Optimal Merge Patterns" problem, which involved merging multiple sorted
sequences efficiently using a greedy approach. We then delved into the "Huffman Coding"
algorithm, which is used for data compression by assigning shorter binary codes to more
frequent characters. Next, we examined the "Minimum Spanning Trees" problem and the greedy
algorithms, Prim's and Kruskal's, used to find the minimum weight spanning tree in a graph.
After that, we discussed the "Knapsack Problem" and its limitation with the greedy strategy, as it
requires more sophisticated techniques like dynamic programming for an optimal solution.
We then covered the "Job Sequencing with Deadlines" problem, where the greedy approach can be
successfully applied to find an optimal job schedule, and lastly Dijkstra's algorithm, a greedy
solution to the Single Source Shortest Path problem.
Each topic presented here can be further explored in detail, and algorithms can be analyzed
more thoroughly in terms of time complexity, space complexity, and edge cases. Algorithm
design and analysis play a significant role in computer science and engineering disciplines, and
understanding these concepts is essential for developing efficient and effective solutions to
real-world problems.
In the next unit, we will continue exploring other important topics related to the "Analysis &
Design of Algorithms" to broaden our understanding and problem-solving skills.
Note: The document provides a detailed explanation of the topics. Each topic can be further
expanded with more examples, proofs, and complexity analyses. If you need additional details or
any specific aspects emphasized, please let us know, and we'll be glad to expand the content
accordingly.