This document provides an overview of Hadoop MapReduce scheduling algorithms. It discusses several commonly used algorithms like FIFO, fair scheduling, and capacity scheduler. It also introduces more advanced algorithms such as LATE, SAMR, ESAMR, locality-aware scheduling, and center-of-gravity scheduling that aim to improve metrics like fairness, throughput, response time, and resource utilization. The document concludes by listing references for further reading on MapReduce scheduling techniques.
3. Introduction
Job scheduling in multi-user environments is a key challenge in MapReduce
Each node is a physical machine with computational and storage capabilities
Hadoop uses the concept of slots on each node to control the maximum
number of tasks that can execute concurrently on that node.
Each slot can execute only one task at a time
Two types of slot: map slots and reduce slots.
5. Quality Metrics for MapReduce scheduling algorithms
Fairness
Throughput
Response time
Availability
Energy efficiency
Resource utilization
Scalability
Overheads
6. FIFO
The default Hadoop scheduler
First in, first out
The main objective:
to schedule jobs based on their priorities in first-come, first-served order
Limitations:
poor response times for short jobs compared to large jobs
low performance when running multiple types of jobs
gives good results only for a single type of job
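As a minimal sketch (not Hadoop's actual implementation), FIFO ordering with priorities can be modeled as a heap keyed on (priority, arrival order); the class and field names here are illustrative:

```python
import heapq
import itertools

class FIFOScheduler:
    """Sketch of Hadoop-style FIFO scheduling: jobs are ordered by
    priority first, then by submission order (first-come, first-served)."""

    def __init__(self):
        self._queue = []               # min-heap of (priority, seq, job)
        self._seq = itertools.count()  # tie-breaker preserving arrival order

    def submit(self, job, priority=0):
        # A lower numeric priority value is scheduled earlier.
        heapq.heappush(self._queue, (priority, next(self._seq), job))

    def next_job(self):
        # Return the highest-priority, earliest-submitted job, or None.
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[2]

sched = FIFOScheduler()
sched.submit("job-A")
sched.submit("job-B")
sched.submit("urgent", priority=-1)
print(sched.next_job())  # "urgent" jumps ahead; job-A and job-B follow
```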
7. Fair Scheduling
All jobs get, on average, an equal share of resources over time
The objective:
an equal distribution of compute resources among the users/jobs in the system
Covers some limitations of FIFO:
works well in both small and large clusters
less complex
Disadvantage:
does not consider the job weight of each node
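The equal-distribution idea can be sketched as a max-min fair allocation: every job gets an equal share of slots, and any share a job cannot use spills over to the others. This is an illustrative sketch, not the Hadoop Fair Scheduler's code; `fair_shares` and its arguments are invented for the example:

```python
def fair_shares(capacity, demands):
    """Max-min fair-share sketch: split `capacity` slots equally among
    jobs, redistributing any share a job cannot use (because its demand
    is smaller) to the remaining jobs."""
    shares = {}
    remaining = dict(demands)
    cap = capacity
    while remaining and cap > 0:
        equal = cap / len(remaining)
        # Jobs demanding less than the equal share are fully satisfied.
        satisfied = {j: d for j, d in remaining.items() if d <= equal}
        if not satisfied:
            # Everyone wants more than the equal share: split evenly.
            for j in remaining:
                shares[j] = equal
            return shares
        for j, d in satisfied.items():
            shares[j] = d
            cap -= d
            del remaining[j]
    for j in remaining:
        shares[j] = 0
    return shares

print(fair_shares(100, {"A": 20, "B": 50, "C": 100}))  # A's unused share spills to B and C
```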
8. Capacity scheduler
Similar to fair scheduling, but uses queues instead of pools
Queues and sub-queues
Capacity Guarantee with elasticity
ACLs for security
Runtime changes/draining apps
Resource based scheduling
9. Speculative Execution
Identify slow tasks
The job progress score in Hadoop:
ps = M/N for map tasks
ps = (1/3)(k + M/N) for reduce tasks
where M/N is the fraction of input processed so far and k ∈ {0, 1, 2} is the number of completed reduce phases (copy, sort, reduce)
The average progress over the K running tasks:
ps_avg = (Σ_{i=1}^{K} ps[i]) / K
Tasks that need a backup: for task T_i, launch a backup if ps_i < ps_avg − 20%
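The progress-score bookkeeping above can be sketched as follows (the task dictionaries and field names are made up for the illustration):

```python
def progress_score(task):
    """Progress score ps, sketching Hadoop's definition: M/N for map
    tasks; (1/3)(k + M/N) for reduce tasks, where k is the number of
    finished reduce phases (copy, sort, reduce)."""
    m_over_n = task["processed"] / task["total"]
    if task["type"] == "map":
        return m_over_n
    return (task["phases_done"] + m_over_n) / 3.0

def backup_candidates(tasks):
    """Tasks whose progress lags the average by more than 20%."""
    scores = [progress_score(t) for t in tasks]
    avg = sum(scores) / len(scores)
    return [t["id"] for t, s in zip(tasks, scores) if s < avg - 0.2]

tasks = [
    {"id": "m1", "type": "map", "processed": 90, "total": 100},
    {"id": "m2", "type": "map", "processed": 80, "total": 100},
    {"id": "m3", "type": "map", "processed": 30, "total": 100},
]
print(backup_candidates(tasks))  # only m3 lags far enough behind
```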
10. Longest Approximate Time to End (LATE)
A scheduler that robustly improves performance in heterogeneous
environments by reducing the overhead of speculative task execution
Finds genuinely slow tasks by estimating the remaining time of all tasks
It ranks tasks by estimated time remaining and starts a copy of the highest-
ranked task whose progress rate is below the SlowTaskThreshold
PR = PS / T_r (progress rate, where T_r is the time the task has been running)
TTE = (1 − PS) / PR (estimated time to end)
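The two estimates can be sketched as below; the task records, the percentile cutoff standing in for the SlowTaskThreshold, and the function name are illustrative assumptions, not LATE's exact implementation:

```python
def late_candidates(tasks, now, slow_task_cap=0.25):
    """For each task compute PR = PS / T_r and TTE = (1 - PS) / PR.
    Tasks whose progress rate falls below the slow-task cutoff (here:
    the 25th percentile of rates) are speculation candidates, ranked
    by longest estimated time to end first."""
    stats = []
    for t in tasks:
        elapsed = now - t["start"]   # T_r: time the task has run
        pr = t["ps"] / elapsed       # progress rate
        tte = (1.0 - t["ps"]) / pr   # estimated time to end
        stats.append((t["id"], pr, tte))
    rates = sorted(pr for _, pr, _ in stats)
    threshold = rates[int(len(rates) * slow_task_cap)]
    slow = [(tid, tte) for tid, pr, tte in stats if pr < threshold]
    return sorted(slow, key=lambda x: -x[1])  # longest time-to-end first

tasks = [
    {"id": "t1", "ps": 0.9, "start": 0},
    {"id": "t2", "ps": 0.8, "start": 0},
    {"id": "t3", "ps": 0.2, "start": 0},
    {"id": "t4", "ps": 0.7, "start": 0},
]
print(late_candidates(tasks, now=10))  # only t3 is flagged as slow
```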
11. Longest Approximate Time to End (LATE)
The advantage:
robustness to node heterogeneity, since only some of the slowest tasks
are speculatively restarted.
This method does not break the synchronization phase between the map and
reduce phases, but only takes action on appropriate slow tasks.
12. Self-Adaptive MapReduce (SAMR)
Uses historical information about nodes and jobs (execution time,
system resources) to classify nodes as fast (finish a task in a
shorter time) or slow (finish a task in a longer time)
13. Self-Adaptive MapReduce (SAMR)
SAMR decreases execution time by up to 25% compared with Hadoop's default
scheduler and by 14% compared with the LATE scheduler.
14. Enhanced Self-Adaptive MapReduce (ESAMR)
SAMR does not consider that the size of datasets and the type of jobs
may lead to different weights for the map and reduce stages.
ESAMR classifies the historical information stored on every node into k
clusters using a machine-learning technique.
If a running job has completed some map tasks on a node, ESAMR calculates a
temporary map-phase weight (M1) on that node according to the job's map
tasks completed there.
15. Enhanced Self-Adaptive MapReduce (ESAMR)
The temporary M1 weight is used to find the cluster whose M1 weight is the
closest.
Uses the cluster’s stage weights to estimate the job’s map tasks’ TimeToEnd on
the node and identify slow tasks that need to be re-executed.
Reduce phase : similar procedure.
After a job has finished, ESAMR calculates the job's stage weights on every
node and saves these new weights as part of the historical information.
Applies k-means to re-classify the historical information stored on every worker
node into k clusters and saves the updated average stage weights for each of the
k clusters
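In miniature, the clustering step might look like the following: a tiny 1-D k-means over historical M1 weights, plus the nearest-cluster lookup for a running job. This is purely illustrative; the paper clusters full stage-weight records, not single scalars:

```python
def kmeans_1d(values, k, iters=20):
    """Tiny 1-D k-means over historical map-phase (M1) weights."""
    # Seed centers with evenly spaced sorted values.
    centers = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest center.
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Recompute each center as the mean of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def closest_cluster(centers, temp_m1):
    """Pick the cluster whose average M1 weight is nearest the temporary
    M1 weight observed for the running job on this node."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - temp_m1))

historical_m1 = [0.61, 0.63, 0.60, 0.31, 0.29, 0.33]
centers = kmeans_1d(historical_m1, k=2)
print(closest_cluster(centers, temp_m1=0.58))  # nearest cluster index
```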
16. Delay
To address the conflict between locality and fairness
when a node requests a task,
if the head-of-line job cannot launch a local task
skip it and look at subsequent jobs
if a job has been skipped long enough
start allowing it to launch non-local tasks, to avoid starvation
temporarily relaxes fairness to improve locality by asking
jobs to wait for a scheduling opportunity on a node with
local data
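The skip-counting logic above can be sketched as follows (the `max_skips` counter stands in for the paper's wait-time threshold, and the job/node data shapes are assumed):

```python
def delay_schedule(node, jobs_in_fair_order, max_skips=3):
    """When `node` has a free slot, walk jobs in fair-share order.
    A job without local data on the node is skipped; only after being
    skipped more than `max_skips` times may it launch a non-local task,
    which prevents starvation."""
    for job in jobs_in_fair_order:
        if node in job["local_nodes"]:
            job["skips"] = 0
            return job["id"], "local"
        job["skips"] = job.get("skips", 0) + 1
        if job["skips"] > max_skips:
            job["skips"] = 0
            return job["id"], "non-local"
    return None

jobs = [{"id": "J1", "local_nodes": {"n2"}, "skips": 3},
        {"id": "J2", "local_nodes": {"n1"}}]
print(delay_schedule("n1", jobs))  # J1 has waited long enough: non-local
```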
17. Maestro
Avoids the non-local map task execution problem by relying on replica-aware
execution of map tasks
Keeps track of chunk and replica locations, along with the number of other
chunks hosted by each node
Efficiently schedules each map task on a data-local node that has minimal
impact on other nodes' local map task executions
18. Maestro
It does map task scheduling in two waves:
initially, it fills the empty slots of each data node based on the number of
hosted map tasks and on the replication scheme for their input data
second, runtime scheduling takes into account the probability of
scheduling a map task on a given machine depending on the replicas of
the task’s input data
Provides higher locality in the execution of map tasks and a more balanced
intermediate data distribution for the shuffle phase.
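One way to picture the replica-aware choice (the data model and selection rule here are simplifying assumptions, not Maestro's exact weighting):

```python
def pick_chunk_for_node(node, replica_map):
    """Among chunks with a replica on `node`, launch the one with the
    fewest replicas on other nodes: other nodes can still run the
    remaining chunks locally, so locality is preserved overall."""
    local = [(chunk, len(nodes - {node}))
             for chunk, nodes in replica_map.items() if node in nodes]
    if not local:
        return None
    # Choose the chunk whose loss hurts other nodes the least.
    return min(local, key=lambda x: x[1])[0]

replica_map = {"c1": {"n1", "n2", "n3"},
               "c2": {"n1"},
               "c3": {"n2", "n3"}}
print(pick_chunk_for_node("n1", replica_map))  # "c2": only n1 holds it
```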
19. Context-aware Scheduler
Exploits the existing heterogeneity of most clusters and the workload
mix, proposing optimizations for jobs that use the same dataset
The design is based on two key insights:
First, a large percentage of MapReduce jobs are run periodically and
roughly have the same characteristics regarding CPU, network, and
disk requirements
Second, the nodes in a Hadoop cluster become heterogeneous over
time due to failures, when newer nodes replace old ones
20. Context-aware Scheduler
The scheduler uses three steps to
accomplish its objective
classify jobs as CPU or I/O bound
classify nodes as Computational or I/O
map the tasks of a job with different
demands to the nodes that can fulfill the
demands
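A toy version of these three steps (the classification threshold and node labels are invented for the example):

```python
def classify_job(cpu_seconds, bytes_io):
    """Classify a job as CPU- or I/O-bound. The cutoff of 1e-6 CPU
    seconds per byte of I/O is an arbitrary illustrative threshold."""
    return "cpu" if cpu_seconds / max(bytes_io, 1) > 1e-6 else "io"

def assign(job_class, nodes):
    """Map a job to a node whose strength matches the job's demand;
    fall back to any node if none match."""
    matching = [n for n in nodes if n["class"] == job_class]
    return (matching or nodes)[0]["name"]

nodes = [{"name": "fat-cpu", "class": "cpu"},
         {"name": "fast-disk", "class": "io"}]
print(assign(classify_job(cpu_seconds=500, bytes_io=10**8), nodes))
```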
21. Locality-Aware Reduce Task Scheduler
Modifies reduce-phase scheduling to become aware of partition locations and sizes
Balances scheduling delay, scheduling skew, system utilization, and parallelism
Decreases network traffic
22. Center-of-Gravity Reduce Scheduler
A locality-aware and skew-aware Reduce task scheduler
Attempts to schedule every Reduce task at its center-of-gravity node,
determined by the network locations
Allows MapReduce jobs to co-exist on the same system while saving
MapReduce network traffic
23. COSHH
Considers heterogeneity at both the application and cluster levels
The main approach: use system information to make better scheduling decisions,
which improves performance.
Two main processes:
New job (from a user): a queuing process stores the incoming job in an appropriate queue
Heartbeat (from a free resource): triggers the routing process to assign a job to the current free resource
24. Self-Adaptive Scheduling Algorithm for Reduce Start Time (SARS)
An optimal scheduling policy for the reduce task start time
Works by delaying the reduce processes to:
shorten the copy duration of the reduce process
decrease the task completion time
save reduce slot resources
Limitation: focuses only on the reduce process
25. Summary
Scheduling Algorithm — Idea to Implementation
FIFO — schedules jobs based on their priorities in first-come, first-out order
Fair Scheduling — equal distribution of compute resources among the users/jobs in the system
Capacity — maximizes resource utilization and throughput in a multi-tenant cluster environment
Hybrid scheduler based on dynamic priority — designed for data-intensive workloads; tries to maintain data locality during job execution
LATE — fault tolerance
26. Summary
Scheduling Algorithm — Idea to Implementation
SAMR — improves MapReduce by saving execution time and system resources
Delay scheduling — addresses the conflict between locality and fairness
Maestro — proposed for map tasks, to improve the overall performance of the MapReduce computation
CREST — re-executes a combination of tasks on a group of computing nodes
Context-aware scheduler — optimizations for jobs using the same dataset
LARTS — decreases network traffic
27. Summary
Scheduling Algorithm — Idea to Implementation
CoGRS — attempts to schedule every Reduce task at its center-of-gravity node, determined by the network locations
MaRCO — achieves nearly full overlap via the novel idea of including the reduce in the overlap
COSHH — proposed to improve the mean completion time of jobs
SARS — shortens the copy duration of the reduce process, decreases the task completion time, and saves reduce slot resources
28. References
1. Varma, Rakesh. "Survey on MapReduce and Scheduling Algorithms in Hadoop." International Journal of Science and
Research 4.2 (2015).
2. Zaharia, Matei, et al. "Job scheduling for multi-user mapreduce clusters." EECS Department, University of California,
Berkeley, Tech. Rep. UCB/EECS-2009-55 (2009).
3. Tiwari, Nidhi, et al. "Classification framework of MapReduce scheduling algorithms." ACM Computing Surveys
(CSUR) 47.3 (2015): 49.
4. Zaharia, Matei, et al. "Improving MapReduce Performance in Heterogeneous Environments." OSDI. Vol. 8. No. 4.
2008.
5. Kumar, K. Arun, et al. "CASH: context aware scheduler for Hadoop." Proceedings of the International Conference on
Advances in Computing, Communications and Informatics. ACM, 2012.
6. Hammoud, Mohammad, M. Suhail Rehman, and Majd F. Sakr. "Center-of-gravity reduce task scheduling to lower
mapreduce network traffic." Cloud Computing (CLOUD), 2012 IEEE 5th International Conference on. IEEE, 2012.
7. Rasooli, Aysan, and Douglas G. Down. "COSHH: A classification and optimization based scheduler for heterogeneous
Hadoop systems." Future Generation Computer Systems 36 (2014): 1-15.
8. Lei, Lei, Tianyu Wo, and Chunming Hu. "CREST: Towards fast speculation of straggler tasks in MapReduce." e-
Business Engineering (ICEBE), 2011 IEEE 8th International Conference on. IEEE, 2011.
29. References
9. Zaharia, Matei, et al. "Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling."
Proceedings of the 5th European conference on Computer systems. ACM, 2010.
10. Sun, Xiaoyu, Chen He, and Ying Lu. "ESAMR: an enhanced self-adaptive MapReduce scheduling algorithm." Parallel
and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on. IEEE, 2012.
11. Nguyen, Phuong, et al. "A hybrid scheduling algorithm for data intensive workloads in a mapreduce environment."
Proceedings of the 2012 IEEE/ACM Fifth International Conference on Utility and Cloud Computing. IEEE Computer
Society, 2012.
12. Hammoud, Mohammad, and Majd F. Sakr. "Locality-aware reduce task scheduling for MapReduce." Cloud
Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on. IEEE, 2011.
13. Ibrahim, Shadi, et al. "Maestro: Replica-aware map scheduling for mapreduce." Cluster, Cloud and Grid Computing
(CCGrid), 2012 12th IEEE/ACM International Symposium on. IEEE, 2012.
14. Chen, Quan, et al. "Samr: A self-adaptive mapreduce scheduling algorithm in heterogeneous environment."
Computer and Information Technology (CIT), 2010 IEEE 10th International Conference on. IEEE, 2010.
15. Tang, Zhuo, et al. "A self-adaptive scheduling algorithm for reduce start time." Future Generation Computer Systems
43 (2015): 51-60.