Adaptive job scheduling with load balancing for workflow application

International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 2, Number 1, Dec - Jan (2011), pp. 09-21
© IAEME, http://www.iaeme.com/ijcet.html

ADAPTIVE JOB SCHEDULING WITH LOAD BALANCING
FOR WORKFLOW APPLICATION IN GRID PLATFORM
D.Daniel
PG Scholar
Karunya University
E-Mail: daniel_joen@yahoo.com

Mrs.S.P.Jeno Lovesum M.E
Asst.Professor
Karunya University
E-Mail: jenolovesum@karunya.edu

D.Asir
PG Scholar
Karunya University
E-Mail: asir.info87@gmail.com

A.Catherine Esther Karunya
PG Scholar
Karunya University
E-Mail: catherineesther@karunya.edu.in

ABSTRACT
Grid computing serves as a globally connected system that performs high-performance
computing in many practical applications. Scheduling plays a key role in providing
performance for grid workflow applications. Various scheduling strategies have been
proposed, including static strategies, which map jobs to resources before execution time,
and dynamic alternatives, which schedule an individual job only when it is ready to execute.
Both kinds of strategy incur significantly high scheduling cost, and they may not
produce a good-quality schedule at low cost. This paper proposes a novel semidynamic
algorithm with load balancing, which allows the schedule to adapt and to schedule the
jobs according to changes in the dynamic grid environment. The proposed algorithm
schedules the jobs statically and then continues with dynamic scheduling because of the
dynamic nature of the grid. Makespan and resource usage are the two main objectives of
this scheduling algorithm. When resource and performance fluctuations occur in the grid
environment, they affect the processing of the jobs and delay the job completion time. In
this algorithm, load balancing is incorporated to handle such situations: jobs are managed
after they are dispatched to their respective hosts. When resource fluctuation occurs
because of the dynamic nature of the grid, or a processor is overloaded with jobs so that
the makespan is delayed, load balancing is performed to manage job execution and to
achieve the desired makespan.
Index Terms: DAG, Tasks, Makespan, Resource usage, Semidynamic scheduling.
I. INTRODUCTION
Grids, as geographically distributed computing systems, have a variety of resources,
often dispersed geographically, that are interconnected and shared to address scientific
and engineering challenges, in which the majority of applications fall into the
interdependent task model. These applications are generally known as workflow
applications [4]. Due to the growing popularity of grid computing systems, many
applications have been attempting to take advantage of these computing environments.
Such applications are generally constructed by interweaving interdependent jobs.
Workflow applications are essentially the same as typical parallel programs, with one
exception: a workflow application consists of a set of interdependent applications (not
partitioned tasks of a parallel program). Like conventional parallel programs, workflow
applications can be represented by a DAG. A DAG, G = (V, E), consists of a set V of v
nodes and a set E of e edges. A DAG is also known as a task graph or macro dataflow
graph. The nodes usually represent jobs of a workflow application, and the edges usually
represent precedence constraints. An edge (i, j) ∈ E between job n_i and job n_j
represents the inter-job communication. Specifically, the outputs of job n_i must be
transmitted to job n_j for job n_j to start its execution. A job with no predecessors is
called an entry job, n_entry; an exit job, n_exit, is one that has no successors. Among
the predecessors of a job n_i, the predecessor that completes its communication at the
latest time is the most influential parent (MIP) of the job, denoted as MIP(n_i). A job is
called a ready job if all of its predecessors have been completed. The longest path of a
task graph is the critical path (CP) [1].
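The definitions above can be made concrete with a small sketch. The four-job workflow, its communication times, and the helper names (`ready_jobs`, `mip`) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical workflow: edge (i, j) -> communication time from job i to job j.
comm = {("n1", "n2"): 2, ("n1", "n3"): 1, ("n2", "n4"): 3, ("n3", "n4"): 2}
jobs = ["n1", "n2", "n3", "n4"]

# Build predecessor and successor lists from the edge set E.
pred, succ = defaultdict(list), defaultdict(list)
for i, j in comm:
    succ[i].append(j)
    pred[j].append(i)

entry_jobs = [n for n in jobs if not pred[n]]  # jobs with no predecessors
exit_jobs = [n for n in jobs if not succ[n]]   # jobs with no successors


def ready_jobs(completed):
    """Jobs not yet completed whose predecessors have all completed."""
    return [n for n in jobs
            if n not in completed and all(p in completed for p in pred[n])]


def mip(job, finish_time):
    """Most influential parent: the predecessor whose output arrives last
    (its finish time plus the communication time to `job`)."""
    return max(pred[job], key=lambda p: finish_time[p] + comm[(p, job)])
```

For example, once `n1` has completed, `ready_jobs({"n1"})` yields `n2` and `n3`, while `n4` stays blocked until both of them finish.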
Workflow applications can take advantage of a grid computing platform;
however, these applications, together with resource heterogeneity and dynamism,
impose a great burden on scheduling. In some systems, workflow scheduling is left
to manual dispatch by users, while other systems employ automated workflow
management platforms (WMPs) [1]. These WMPs tend to focus on minimizing
the application’s completion time. However, there are other important performance
considerations for WMPs, such as resource usage, load balancing, and fault tolerance.
Although some WMPs have facilities to deal with these considerations, they often lack
the capability of explicit resource usage control. Rather, for the sake of fault tolerance,
resources are overused (for example, through task duplication).
Job scheduling is closely related to load balancing. For a given set of jobs and
resources, load balancing can be performed in two ways: prediction based and
non-prediction based. The prediction-based approach collects in advance the number of
jobs it has to schedule and the number of resources, i.e., processors, available. In this
case, job scheduling takes the availability of the resources into account, the load is
distributed evenly across all resources based on their status, and the scheduled jobs are
dispatched to the hosts [11]. The non-prediction-based approach has no prior information
about the resources and the jobs. It schedules the jobs according to the scheduling
strategy and dispatches them to the hosts; as the availability of the resources changes
dynamically, the load is migrated and the jobs are executed. In this case, the load is
balanced dynamically.
The rest of the paper is organized as follows. Section II reviews related work,
Section III presents the adaptive scheduling strategy in detail, Section IV gives the
system model, Section V describes the proposed scheduling with load balancing, and
Section VI compares and evaluates the proposed scheduling, followed by the conclusion
and future work in Section VII.
II. RELATED WORKS
Since workflow scheduling in grids is in many respects similar to the
conventional task scheduling problem in tightly coupled heterogeneous computing
systems (e.g., clusters), some well-known task scheduling algorithms (e.g., HEFT) have
been adopted and modified for grid workflow scheduling. Most of these algorithms are
designed to cope with the dynamic nature of the grid. More specifically, rescheduling
and advance reservation, among other techniques, are often used to deal with
uncertainties in resource performance. Most job scheduling approaches adapted from
traditional task scheduling algorithms fall into two categories: look-ahead and
just-in-time. The major difference between the two is whether scheduling decisions are
made before the actual job dispatch or at the time ready jobs are identified, i.e., when
their predecessors have all been completed. Clearly, for look-ahead approaches, the
acquisition of accurate performance information on resources plays a critical role in
decision making [9]. One major drawback of just-in-time approaches is the loss of
timely data transfers. For example, if a job has three predecessors that complete at
different times, the data transfers from these predecessors to the job start only when
the last predecessor completes its execution. The time between the completion of each
of the first two predecessors and the completion of the last predecessor is therefore
wasted.
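The cost of delayed transfers can be illustrated with a small calculation. The finish times and transfer durations below are hypothetical, and the sketch assumes that transfers run in parallel and that, in the look-ahead case, the target host is known early enough for each transfer to start as soon as its predecessor finishes.

```python
def start_time_just_in_time(finish, transfer):
    """All transfers begin only after the last predecessor has finished."""
    last_finish = max(finish.values())
    return last_finish + max(transfer.values())


def start_time_eager(finish, transfer):
    """Each transfer begins as soon as its own predecessor finishes."""
    return max(finish[p] + transfer[p] for p in finish)


# Hypothetical predecessors p1..p3 with finish times and transfer durations.
finish = {"p1": 10, "p2": 14, "p3": 20}
transfer = {"p1": 5, "p2": 4, "p3": 3}

jit_start = start_time_just_in_time(finish, transfer)  # 20 + 5 = 25
eager_start = start_time_eager(finish, transfer)       # max(15, 18, 23) = 23
```

With these numbers the job could start two time units earlier if the outputs of `p1` and `p2` were shipped while `p3` was still running.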
The challenge of scheduling grid workflow applications with a static strategy is
discussed in many studies, but few research efforts address it. Rescheduling is
implemented in GrADS, where it is normally activated by contract violation.
However, these efforts are all conducted for iterative applications, allowing the system
to make rescheduling decisions at each iteration. The plan-switching approach
constructs a family of activity graphs and investigates the means of switching from one
member of the family to another when the execution of one activity graph fails. Its
major drawback is that all the plans are made without knowledge of future environment
changes, since the grid does not ensure a stable computing environment [5].
Another rescheduling policy considers rescheduling at a few carefully selected points
during the execution. This research tackles one of the shortcomings of static scheduling,
namely that it always assumes accurate prediction of job performance. After the initial
schedule is made, it selectively reschedules some jobs if the run-time performance
variance exceeds a predefined threshold. However, this approach deals only with
inaccurate estimation and does not consider changes in the resource pool [10].
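A threshold test of this kind might look like the following sketch. The relative-deviation measure, the 20% threshold, and the function names are assumptions made for illustration; the cited work may define performance variance differently.

```python
def needs_reschedule(estimated, actual, threshold=0.2):
    """True if the observed run time deviates from the estimate by more than
    `threshold` (relative deviation; the 0.2 default is a hypothetical value)."""
    return abs(actual - estimated) / estimated > threshold


def jobs_to_reschedule(estimates, observations, threshold=0.2):
    """Select only the jobs whose observed performance drifted too far."""
    return [job for job in estimates
            if job in observations
            and needs_reschedule(estimates[job], observations[job], threshold)]
```

For instance, with estimates `{"n1": 100, "n2": 50}` and observations `{"n1": 130, "n2": 52}`, only `n1` is selected, because its run time is 30% above the estimate.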
Since the majority of the tasks that grid computing handles are interdependent and
most of them are workflow applications, scheduling must concentrate on resource
usage. To use resources in a well-organized way, the scheduler must know detailed
information about each resource, not just the number of resources. At a minimum, the
scheduler needs the number of resources available in the grid to schedule n jobs. To
produce an effective schedule, the scheduler also needs the status of each resource: its
processor speed, how much time remains before it starts executing another job, how
many jobs it can handle in a given amount of time, etc. The adaptive scheduling
strategy is framed on this basis: when the schedule does not produce optimal
performance, or when the availability of resources changes dynamically, the scheduler
adapts to the situation and schedules the jobs so that they complete execution.
III. ADAPTIVE SCHEDULING STRATEGY
Even though static and dynamic scheduling can perform close to optimal, their
effectiveness in a dynamic grid environment is questionable. This paper proposes a
novel adaptive job scheduling algorithm with load balancing, based on a semidynamic
strategy, by which the workflow scheduler can adapt to grid dynamics and realize its
strengths in practice.
A. Issues with Traditional Scheduling
Planning is a one-time activity in traditional static scheduling. Static
scheduling does not consider future changes in the grid environment after the resource
mapping is made. Rescheduling in the execution phase has been proposed, but it is
mainly used to support fault tolerance. Overall, the issues with traditional static
scheduling are: (1) accuracy of estimation of communication and computation costs, (2)
adaptation to a dynamic environment, and (3) separation of the workflow scheduler from
the executor.

Figure 3.1 Classification of Static scheduling
Fundamentally, the first two issues are related to the lack of collaboration between the
workflow scheduler and the executor. With collaboration, a scheduler is aware of
changes in the grid environment, including job performance variance and resource
availability, and is able to reschedule adaptively based on increasingly accurate
estimations. This approach can both continuously improve performance by considering
new resources and minimize the impact of unexpected resource downgrade or
unavailability [9]. The main issue with dynamic scheduling in workflow applications is
the execution procedure of the interdependent tasks: the output of one job can be the
input of another. In dynamic scheduling, jobs are executed in all possible ways, with the
resources taken into account for scheduling.
B. Adaptive Scheduling
The basic idea of adaptive scheduling is as follows: for a given DAG and a set of
currently available resources, the scheduler makes the initial resource mapping as any
traditional static approach does [5]. In addition, the scheduler receives updates from the
executor about resource information, such as:
Resource Pool Change: If a new resource is discovered after the current plan is
made, rescheduling may reduce the makespan of a DAG by taking the added resource
into account. When a resource fails, the fault-tolerance mechanism is triggered and the
failure is handled by the Executor. However, if the failure is predictable, rescheduling
can minimize its impact on overall performance.
Resource Performance Variance: Performance estimation accuracy depends
largely on history data, and inaccurate estimation leads to a bad schedule. If
the run-time Performance Monitor notifies the scheduler of any significant
performance variance, the scheduler, along with the predictor, evaluates its impact and
reschedules if necessary.
The scheduler reacts to an event by evaluating whether the makespan can be reduced
by rescheduling. For example, if a new resource becomes available, the scheduler
evaluates whether a new schedule that takes the extra resource into consideration can
produce a smaller makespan [7]. If so, the scheduler replaces the current schedule with
the new one by submitting it to the Executor.
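The event-driven decision described above can be sketched as follows. This is a simplified illustration, not the AHEFT algorithm of [7]: it assumes independent jobs (precedence constraints are ignored), a hypothetical `speeds` table of relative processor speeds, and a plain greedy earliest-finish heuristic as the underlying scheduler.

```python
def greedy_schedule(job_work, speeds):
    """Assign each job (largest first) to the resource that would finish it
    earliest. Returns (assignment, makespan). Precedence is ignored here."""
    free_at = {r: 0.0 for r in speeds}
    assignment = {}
    for job, work in sorted(job_work.items(), key=lambda kv: -kv[1]):
        best = min(speeds, key=lambda r: free_at[r] + work / speeds[r])
        free_at[best] += work / speeds[best]
        assignment[job] = best
    return assignment, max(free_at.values())


def on_resource_event(current, job_work, speeds):
    """Re-plan when the resource pool changes; keep the current plan unless
    the candidate plan has a strictly smaller makespan."""
    candidate = greedy_schedule(job_work, speeds)
    return candidate if candidate[1] < current[1] else current


# Hypothetical remaining jobs (units of work) and a pool that gains a faster host.
job_work = {"a": 8, "b": 6, "c": 4}
current = greedy_schedule(job_work, {"r1": 1.0, "r2": 1.0})
current = on_resource_event(current, job_work, {"r1": 1.0, "r2": 1.0, "r3": 2.0})
```

In a real workflow scheduler only the jobs that have not yet started would be re-planned, and the replacement schedule would be submitted to the Executor rather than simply returned.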
IV. SYSTEM MODEL
This paper proposes an adaptive rescheduling approach that can both continuously
improve performance by considering new resources and minimize the impact of
unexpected resource downgrade or unavailability.

A grid consists of a number of sites. Each site is autonomous; it has its own
local users as well as global users, and its hosts are time shared and resource shared.
The hosts of the same group or organization are clustered together and termed a site.
The resources of a cluster or site can be accessed by the hosts of that site as their own.
Complexity arises when a resource has to be accessed from another site. Administrators
are assigned to every site, and they deploy access policies, access rights, processor
allocation, etc. These differ for every organization. The workflow application has n
interrelated jobs in one task. The time from the start of the first job of the task to the
finish of the nth job is termed the makespan. These jobs are represented by a DAG
(Directed Acyclic Graph): each node is a job, and the edges denote the relations
between jobs. Each cluster has a load scheduler, which performs load balancing and also
reports the status of the resources to the job scheduler, which keeps it in a history
repository for future job scheduling.

Figure 4.1 System Design
V. ADAPTIVE SCHEDULING WITH LOAD BALANCING FOR
WORKFLOW APPLICATIONS
While the task scheduling problem in heterogeneous computing systems remains
very difficult even with perfectly accurate performance information on resources and
applications, uncertainties in resource performance and the lack of control over grid
resources make workflow scheduling even more complex. Unlike many other workflow
scheduling schemes, we consider makespan and resource usage to be equally
important and take both into account in our scheduling model. Efficient resource usage
is crucial in grid scheduling because 1) a grid consists of multiple sites administered by
different entities that use their own resources for other tasks besides the grid jobs, and 2)
due to the fluctuations and uncertainty surrounding sites in a grid system, lower resource
usage (not necessarily minimizing the number of resources used, but rather minimizing
resource time) means lower overall variance in the expected completion
time (makespan) of an application [1].
A. Job Scheduling
To start with, the schedule is made statically from the predefined tasks and their
execution procedures. The tasks are scheduled statically based on their priority; the
priority of each task is set to its upward rank value, rank_u, which is based on mean
computation and communication costs. The task list is created by sorting the tasks in
decreasing order of rank_u. Tie-breaking is done randomly. There can be alternative
policies for tie-breaking, such as selecting the task whose immediate successor tasks
have higher upward ranks. Since these alternative policies increase time complexity, the
random selection strategy is preferred [7]. The task list built from these priorities is
taken as S* (the current schedule).
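As a rough illustration of the ranking step, the sketch below uses the upward rank as defined for HEFT [5]: rank_u(n_i) is the mean computation cost of n_i plus the maximum, over its successors n_j, of the mean communication cost of edge (i, j) plus rank_u(n_j); for an exit job it is just the mean computation cost. The two-host cost table and the job names are hypothetical.

```python
from functools import lru_cache

# Hypothetical data: computation time of each job on each of two hosts,
# and the mean communication time on each edge.
w = {"n1": [4, 6], "n2": [3, 5], "n3": [6, 2], "n4": [2, 2]}
c = {("n1", "n2"): 2, ("n1", "n3"): 1, ("n2", "n4"): 3, ("n3", "n4"): 2}
succ = {"n1": ["n2", "n3"], "n2": ["n4"], "n3": ["n4"], "n4": []}


@lru_cache(maxsize=None)
def rank_u(job):
    """Upward rank: mean computation cost plus the costliest path to an exit job."""
    mean_w = sum(w[job]) / len(w[job])
    return mean_w + max((c[(job, s)] + rank_u(s) for s in succ[job]), default=0)


# Priority list: highest upward rank first. sorted() is stable, so ties keep
# their input order here rather than being broken randomly as in the text.
task_list = sorted(w, key=rank_u, reverse=True)
```

With these numbers the ranks are 16 for `n1`, 9 for `n2`, 8 for `n3`, and 2 for `n4`, so every job appears in the list before any of its successors.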
At the end of each iteration, mutation is considered if no improvement on S* was
made during the current iteration. The scheduler randomly chooses a mutation method,
either point or swap mutation, and mutates each job in S* with a probability of 0.5,
which is sufficient to generate substantially different schedules. The mutated schedule is
then used as the current schedule (S*). If there have been some improvements on S* in
the current iteration, it is passed on to the next iteration for further improvement. This
schedule manipulation process repeats for a predefined number of iterations. Finally, the
jobs in the current schedule S* are dispatched to their assigned hosts as they become
ready, i.e., when their predecessor jobs have finished [1].
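The mutation step can be sketched as below. The paper does not define the two operators, so this sketch assumes that a schedule is a job-to-host mapping, that point mutation reassigns a job to a randomly chosen host, and that swap mutation exchanges the hosts of two jobs. All function names are hypothetical.

```python
import random


def point_mutation(schedule, hosts, rng, p=0.5):
    """Reassign each job to a randomly chosen host with probability p."""
    return {job: (rng.choice(hosts) if rng.random() < p else host)
            for job, host in schedule.items()}


def swap_mutation(schedule, rng, p=0.5):
    """With probability p per job, exchange its host with that of another
    randomly chosen job (possibly itself, which leaves it unchanged)."""
    mutated = dict(schedule)
    jobs = list(mutated)
    for job in jobs:
        if rng.random() < p:
            other = rng.choice(jobs)
            mutated[job], mutated[other] = mutated[other], mutated[job]
    return mutated


def mutate(schedule, hosts, rng, p=0.5):
    """Pick one of the two mutation methods at random and apply it."""
    if rng.random() < 0.5:
        return point_mutation(schedule, hosts, rng, p)
    return swap_mutation(schedule, rng, p)


# Hypothetical current schedule S* and host pool.
s_star = {"n1": "h1", "n2": "h2", "n3": "h1", "n4": "h3"}
mutated = mutate(s_star, ["h1", "h2", "h3"], random.Random(0))
```

Passing an explicit `random.Random` instance keeps runs reproducible; the original schedule is never modified, so it can still be kept if the mutated one turns out worse.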
B. Cluster Based Dynamic Load Balancing
In grid computing, the resources are globally distributed, and geographically
grouped resources are termed clusters. A cluster is a group of processors in an
organization or a LAN (Local Area Network). A number of clusters group together to
perform the computation. The user can access any number of resources from anywhere
at any time. Each cluster has a scheduler, which holds details of the resources or
processors: their current computation, the number of jobs they have executed, the
amount of resources they hold, etc. The scheduler also holds the details of the
neighboring clusters. The clusters communicate with one another to balance the load
dynamically and to execute the jobs effectively [11].
Once the scheduled jobs are dispatched to the hosts, the next step is load
balancing. When the execution of a job is delayed, the Actual Latest Finish Time
(ALFT) of the job is calculated. If the delay is less than the ALFT, the load is balanced.
If the delay is greater than the ALFT, the scheduler communicates with the neighboring
cluster, and the load is dynamically allocated to the processor that is optimal for
executing the remaining job. Migration is used to move the job from one resource to
another.
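The decision rule can be sketched as follows. Several points are assumptions, since the paper does not spell them out: a delay that does not exceed the ALFT is read as requiring no migration, the "optimal" processor is taken to be the neighbor host with the earliest estimated finish for the remaining work, and the host fields `free_at` and `speed` are hypothetical.

```python
def balance(delay, alft, remaining_work, neighbor_hosts):
    """Return ("keep", None) if the delay does not exceed the ALFT, otherwise
    ("migrate", host), where host is the neighbor that would finish the
    remaining work earliest (an assumed notion of 'optimal')."""
    if delay <= alft:
        return ("keep", None)

    def estimated_finish(host):
        info = neighbor_hosts[host]
        return info["free_at"] + remaining_work / info["speed"]

    return ("migrate", min(neighbor_hosts, key=estimated_finish))


# Hypothetical status reported by the neighboring cluster's scheduler.
neighbors = {
    "h1": {"free_at": 5.0, "speed": 1.0},  # busy until t=5, normal speed
    "h2": {"free_at": 0.0, "speed": 0.5},  # idle now, half speed
}
decision = balance(delay=12.0, alft=10.0, remaining_work=4.0, neighbor_hosts=neighbors)
```

Here `h1` would finish the remaining work at 5 + 4 = 9 and `h2` at 0 + 8 = 8, so the job would migrate to `h2`.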
VI. EVALUATION AND COMPARISON
The HEFT-based adaptive rescheduling algorithm (AHEFT), built on the adaptive
scheduling strategy, has the advantage that it continuously improves performance by
considering new resources and minimizes the impact of unexpected resource downgrade
or unavailability [7]. Its drawbacks are that rescheduling the jobs takes time and that, to
implement the collaboration model, rescheduling has to be integrated with advance
resource reservation and a resource availability prediction model. GridFlow has the
advantage that its grid performance service combines performance prediction capability
with a new application response measurement technique [2], which can be used to
enable prediction-based scheduling as well as response-based scheduling. Its
disadvantage is that the process of a grid workflow encompasses multiple
administrative domains (organizations) [7]. The lack of central ownership and control
results in incomplete information, and computational and networking capabilities can
vary significantly over time in the grid environment. Application performance
prediction becomes difficult, and real-time resource information update within a
large-scale global grid becomes impossible, which lowers performance.

Figure 5.1 The structure of adaptive semidynamic scheduling strategy with Load
Balancing
The Critical-Path-on-a-Processor (CPOP) algorithm and the Heterogeneous Earliest
Finish Time (HEFT) algorithm give more or less the same performance: high
performance and fast scheduling time. However, they have the slight disadvantage of
high scheduling cost [4]. The Duplication-based Bottom-Up Scheduling (DBUS)
algorithm uses both task insertion and task duplication to minimize the schedule length
[6]. It does not impose any restriction on the number of task duplications, and task
duplication mainly causes an increase in resource usage, which is a disadvantage of the
algorithm. The ADOS algorithm has the highest potential for reducing the makespan of
the task, which is the total time from the start of the first job to the end of the last job.
It also reduces resource usage. Its disadvantage is that the algorithm itself is
complicated, which makes each iteration more complex when selecting the best
schedule. The adaptive scheduling with load balancing strategy provides the best
performance on both parameters: it reduces the makespan of the jobs and it makes
effective use of resources. The concept provides a less complex static and dynamic
scheduling strategy and a clear condition for when load balancing occurs, which
reduces the scheduling time.
VII. CONCLUSION
In this paper, the scheduling of workflow applications in grids is addressed.
Unlike many previous scheduling approaches for this class of applications, the
semidynamic scheduling strategy takes into account both makespan and resource usage.
The schedule achieves the two objectives by effectively combining a static heuristic
scheduling scheme with a dynamic scheduling technique with load balancing. Based on
the study conducted and the results obtained, the resource-usage-conscious
scheduling scheme significantly improves resource utilization without sacrificing too
much makespan. The load balancing strategy incorporated into scheduling helps
ensure the quality of schedules against performance fluctuations of grid resources.
Future work is the implementation of the proposed adaptive scheduling with load
balancing for workflow applications in a grid environment. The results will be collected
and compared, and a complete performance evaluation study will be conducted to
determine the performance of the schedule on the grid platform.
REFERENCES
1. Y.C. Lee, R. Subrata, and A.Y. Zomaya, “On the Performance of a Dual-Objective
Optimization Model for Workflow Applications on Grid Platforms,” IEEE Trans.
Parallel and Distributed Systems, vol. 20, no. 9, Sept. 2009.
2. J. Cao, S.A. Jarvis, S. Saini, and G.R. Nudd, “GridFlow: Workflow Management
for Grid Computing,” Proc. Third IEEE/ACM Int’l Symp. Cluster Computing and
the Grid (CCGrid ’03), pp. 198-205, 2003.
3. H. Casanova, “Simgrid: A Toolkit for the Simulation of Application Scheduling,”
Proc. First IEEE/ACM Int’l Symp. Cluster Computing and the Grid (CCGrid ’01),
pp. 430-437, 2001.
4. R. Wolski, “Dynamically Forecasting Network Performance Using the Network
Weather Service,” Proc. Sixth IEEE Int’l Symp. High Performance Distributed
Computing (HPDC ’97), pp. 316-325, 1997.
5. H. Topcuoglu, S. Hariri, and M. Wu, “Performance-Effective and Low-Complexity
Task Scheduling for Heterogeneous Computing,” IEEE Trans.
Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, Mar. 2002.
6. D. Bozdag, U. Catalyurek, and F. Ozguner, “A Task Duplication Based Bottom-Up
Scheduling Algorithm for Heterogeneous Environments,” Proc. 19th Int’l
Parallel and Distributed Processing Symp. (IPDPS ’05), Apr. 2005.
7. Z. Yu and W. Shi, “An Adaptive Rescheduling Strategy for Grid Workflow
Applications,” Proc. 21st Int’l Parallel and Distributed Processing Symp.
(IPDPS), 2007.
8. Y. Gil, V. Ratnakar, E. Deelman, G. Mehta, and J. Kim, “Wings for Pegasus:
Creating Large-Scale Scientific Applications Using Semantic Representations of
Computational Workflows,” Proc. 19th Conf. Innovative Applications of Artificial
Intelligence (IAAI ’07), pp. 1767-1774, 2007.
9. M. Wieczorek, R. Prodan, and T. Fahringer, “Scheduling of Scientific Workflows
in the ASKALON Grid Environment,” ACM SIGMOD Record, vol. 34, no. 3, pp.
56-62, Sept. 2005.
10. G. Singh, E. Deelman, G. Mehta, K. Vahi, M.-H. Su, G.B. Berriman, J. Good, J.C.
Jacob, D.S. Katz, A. Lazzarini, K. Blackburn, and S. Koranda, “The Pegasus
Portal: Web Based Grid Computing,” Proc. 20th Ann. ACM Symp. Applied
Computing (SAC ’05), pp. 680-686, 2005.
11. A. Mondal, K. Goda, and M. Kitsuregawa, “Effective Load Balancing via
Migration and Replication in Spatial Grids,” LNCS 2736, pp. 201-211, 2003.
12. B.A. Shirazi, A.R. Hurson, and K.M. Kavi, Scheduling and Load Balancing in
Parallel and Distributed Systems. IEEE CS Press, 1995.
13. A.M. Dobber, G.M. Koole, and R.D. van der Mei, “Dynamic Load Balancing
Experiments in a Grid,” Proc. Fifth IEEE Int’l Symp. Cluster Computing and the
Grid (CCGrid ’05), pp. 123-130, 2005.

D. Daniel received the B.E. degree in Information Technology from
Karunya University in 2009. He is currently pursuing his postgraduate degree at Karunya
University, works on a project in parallel and distributed systems, and continues
research on adaptive scheduling techniques in grid computing.

D. Asir received the B.E. degree in Information Technology from
Anna University in 2009. He is currently pursuing his postgraduate degree at Karunya
University, works on a project in parallel and distributed systems, and continues
research on dynamic load balancing techniques in grid computing.

Mrs. S.P. Jeno Lovesum, Asst. Professor, completed her Master of
Engineering (CSE) at Annamalai University, Chidambaram, and is doing her research in
Cloud Computing.

A. Catherine Esther Karunya received the B.E. degree in
Information Technology from Karunya University in 2009. She is currently pursuing her
postgraduate degree at Karunya University, works on a project in Computing Security,
and continues her research in Networking.
