This document proposes a distributed grid model with four modules: a resource module, worker module, communication module, and user interface module. Each node performs two roles - supervisor and executor. The resource module gathers node information and allocates jobs. The worker module executes jobs and can redirect jobs to less busy nodes for load balancing. Testing showed the proposed model reduced CPU waste compared to a best match algorithm.
ICIC Express Letters ICIC International ©2011 ISSN 1881-803X
Volume 5, Number 10, October 2011 pp. 3731–3735
DESIGNING AND ANALYZING GRID NODE JOB
PROCESS SCHEDULING
Chih-Ting Tsai¹, Heng-Sheng Chen¹, Jin-Shieh Su² and Huey-Ming Lee¹
¹Department of Management Information
²Department of Applied Mathematics
Chinese Culture University
No. 55, Hwa-Kung Road, Yang-Ming-San, Taipei 11114, Taiwan
chihting.tsai@gmail.com; { chenhs; sjs; hmlee }@faculty.pccu.edu.tw
Received February 2011; accepted April 2011
Abstract. Because resources in the grid are highly dynamic, they are very difficult to
manage and allocate. In this paper, we propose a grid model for allocating and invoking
resources without centralized information exchange. In our model, each grid node not only
manages and allocates resources but also executes jobs. Each node monitors the status of
the other nodes so that it can dispatch jobs to appropriate nodes, and it also executes
jobs itself. With collaborating nodes, the workload of the grid nodes is balanced and the
resource waste ratio is reduced.
Keywords: Grid computing, Resource allocation, Distributed system
1. Introduction. The term “Grid” was coined in the mid-1990s to denote a proposed
distributed computing infrastructure for advanced science and engineering [1]. Users
can access nodes, which are the resources of the grid [2,4]. A grid should provide at
least a few functions, such as resource allocation, load balancing and reliability. Each
node contributes resources such as CPU and memory to the grid. The resources in the
grid change dynamically, so allocating and invoking these highly dynamic resources is
very important.
The dynamic resources in a grid environment are hard to allocate and invoke.
Foster et al. [1] presented grid resource allocation and management (GRAM). GRAM
simplifies addressing resources on other nodes and submitting jobs to the grid
environment through a standard interface. Jobs requested by users or other nodes
should be properly controlled and delivered. The grid must ensure that a job is
executed and that its results are returned to the user; otherwise the grid is
meaningless. GRAM [1] provides a series of application programming interfaces (APIs)
for handling users’ jobs, monitoring resources on nodes and invoking these resources.
Lee et al. [5] proposed a dynamic supervising model which can utilize grid resources,
e.g., CPU and storage, and which is more flexible and optimal [6]. Lee et al. [7]
developed a model in which nodes fetch node information from a supervisor, and users
make job requests on the node with the supervisor role. The supervisor node receives or
collects information from all nodes in the grid and invokes resources for executing
jobs. The supervisor role thus makes the load on the grid nodes more balanced. Lee et
al. [8] also proposed a process schedule analyzing model which allows heavily loaded
nodes to transfer jobs to more lightly loaded nodes.
There are several ways to evaluate grid performance. Silberschatz et al. [10] described
CPU utilization, throughput, turnaround time, waiting time and response time as criteria
for job scheduling. Li et al. [9] proposed a load characterization that includes system
utilization, job arrival rate and inter-arrival time, job cancellation rate, job size,
job run time, memory usage and user/group behavior. Xhafa et al. [11] proposed
genetic-algorithm-based schedulers for efficiently allocating resources to jobs.
In this paper, we propose that each node performs two roles: supervisor and executer.
In the supervisor role, a node gathers information from every node in the grid, such as
node name, CPU utilization, memory usage rate, free disk space and job queue length.
Every node in the grid exchanges this information with the others. When a user asks any
node to execute jobs, that node can find appropriate nodes based on the nodes’
information. In the executer role, a node executes the jobs requested by users or other
nodes and ensures that they run correctly. The core module of our model is the
communication module, which handles all kinds of communication among nodes in the grid.
2. Framework of the Proposed Model. Each node performs two roles: supervisor and
executer. A function for communicating with other nodes is also needed. In this work, we
propose a module-based distributed grid model (MBDGM). Nodes communicate with each
other to exchange information and deliver job files.
There are four modules in the proposed model, namely, the user interface module, the
resource module in the role of supervisor, the worker module in the role of executer,
and the communication module, as shown in Figure 1.
The functions of these modules are as follows:
• User Interface Module (UIM): this module displays the status of grid nodes and
accepts job requests from users.
• Resource Module (RM): this module continuously monitors and manages the node. It is
composed of five components: Initiator, Register, Monitor, Manager and Dispatcher.
It initializes the node’s status and joins the node to the grid. It also monitors
the status of nodes in the grid and of the Job Module, and retrieves suitable nodes
for executing jobs.
• Worker Module (WM): this module handles all job execution on the node or transfers
jobs to other nodes. It is composed of three components: Job Queue Keeper, Executer
and Redirector, and it reports the status of these components.
• Communication Module (CM): this module is used to communicate with other nodes.
2.1. User interface module (UIM). The main function of this module is to mediate
between the user and the grid node and to display node status to the user. Users can
submit job requests with the required job parameters to the grid, and these parameters
are passed to the resource module, as shown in Figure 2.
Figure 1. Framework of module based distributed grid model (MBDGM)
Figure 2. Framework of user interface module
2.2. Resource module (RM). The resource module monitors and manages the node with
five components, namely, Initiator, Register, Manager, Monitor and Dispatcher, as shown
in Figure 3. The Initiator component handles the initialization of the local node and
hands over the results to the Register component. The Register component receives the
results from the Initiator component and tries to register the node in our grid; if
registration succeeds, it wakes up the Manager component. The Manager component ensures
that all modules and components work correctly and notifies every node in the grid of
this node’s information. The Monitor component collects only the necessary node
information and notifies the Manager component. The node information consists of node
name, CPU usage, memory usage, free disk space and job queue length; it is stored and
exchanged in XML format. The Dispatcher component tries to find appropriate nodes for
users’ jobs.
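The XML exchange of node information described above could look roughly like the following Python sketch. The element names, field names and units are assumptions made for illustration; the paper does not specify the XML schema, and its own implementation is in C#.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class NodeStatus:
    # Fields mirror the node information listed in the text.
    # The identifiers and units are illustrative assumptions.
    node_name: str
    cpu_usage: float       # percent
    memory_usage: float    # percent
    free_disk_space: int   # megabytes (assumed unit)
    job_queue_length: int


def to_xml(status: NodeStatus) -> str:
    """Serialize one node's status as a small XML document."""
    root = ET.Element("node")
    for key, value in asdict(status).items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> NodeStatus:
    """Parse the XML produced by to_xml back into a NodeStatus."""
    root = ET.fromstring(text)
    return NodeStatus(
        node_name=root.findtext("node_name"),
        cpu_usage=float(root.findtext("cpu_usage")),
        memory_usage=float(root.findtext("memory_usage")),
        free_disk_space=int(root.findtext("free_disk_space")),
        job_queue_length=int(root.findtext("job_queue_length")),
    )
```

A node would send the output of `to_xml` to its peers, and each peer would call `from_xml` to refresh its local view of the grid.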
2.3. Worker module (WM). The worker module monitors the job queue and executes
jobs on the local node, as shown in Figure 4. The Job Queue Keeper keeps track of the
paths of all job files and of the execution of jobs on this node, and either delivers
job information to the Executer or tries to transfer jobs to other nodes if the job
queue is too long. The Executer executes all jobs in the job queue. The Recipient
accepts job files transferred from other nodes and passes the paths of these files to
the Job Queue Keeper.
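The queue-or-transfer behaviour of the Job Queue Keeper can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the `max_length` threshold and the `redirect` callback are hypothetical, since the paper does not state how "too long" is decided or how the transfer is triggered.

```python
from collections import deque


class JobQueueKeeper:
    def __init__(self, max_length, redirect):
        self.max_length = max_length  # hypothetical threshold for "too long"
        self.redirect = redirect      # callable standing in for the transfer step
        self.queue = deque()          # paths of job files waiting to run

    def submit(self, job_path):
        """Queue the job locally, or hand it off if the queue is full."""
        if len(self.queue) >= self.max_length:
            self.redirect(job_path)
            return False
        self.queue.append(job_path)
        return True

    def next_job(self):
        """Return the next job path for the Executer, or None when idle."""
        return self.queue.popleft() if self.queue else None
```

Passing the transfer step in as a callable keeps the queue logic independent of the communication module that would actually move the job file.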
2.4. Communication module (CM). This module handles communication between the local
Resource Module and the Communication Modules on other nodes, as shown in Figure 5. It
is also used to deliver jobs to other nodes in the grid.
Figure 3. Framework of resource module (RM)
Figure 4. Framework of worker module (WM)
Figure 5. Framework of communication module (CM)
3. Model Implementation. To implement our model as quickly as possible, we chose
Visual C# 2008 as the development environment, together with Microsoft .NET Framework
3.5 Service Pack 1. We take advantage of virtualization to examine our model and chose
VMware ESXi 4.0.0. On VMware ESXi we deploy three grid nodes connected by a virtual
switch provided by VMware ESXi. Each node runs Microsoft Windows Server 2003 SP1,
Microsoft .NET Framework 3.5 SP1 and the Microsoft Chart extension, known as MS-Chart,
which is used to plot our information.
Following [3], we built a three-node scenario. The user interface has three blocks: a
job request block, the real-time CPU and memory usage of the local node, and the
information on all nodes, as shown in Figure 6.
Figure 6. User interface
Users can request a batch job, and the results are logged at a specified location. In
our opinion, the results of jobs should be handled by the developer of the jobs, not by
the grid nodes. The grid should focus on balancing the overall load on the nodes rather
than on deciding where the result of a job should be output.
To test our model, we submit ten requests to a single-node environment and to a
three-node grid environment under three CPU limits: 0.7 GHz, 1.4 GHz and 2.8 GHz. The
2.8 GHz limit is the maximum CPU clock of our machine. There are three kinds of test
job: Level-01 computes a = a + 1 for 1 Giga-times, Level-02 for 2 Giga-times and
Level-04 for 4 Giga-times.
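The three test jobs differ only in their iteration count, which a short Python sketch makes explicit. Two things here are assumptions: "Giga" is read as 10^9, and the `scale` parameter is added only so the loop can be tried with a small count.

```python
GIGA = 10**9  # assumed meaning of "Giga-times"


def run_test_job(level, scale=GIGA):
    """Run a Level-<level> job: increment a counter level * scale times."""
    a = 0
    for _ in range(level * scale):
        a = a + 1
    return a
```

For example, `run_test_job(4, scale=1000)` performs the Level-04 loop at a reduced scale of 4000 increments.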
Table 1. Best match algorithm CPU waste ratio (%)
Scenario     Level-01 Job   Level-02 Job   Level-04 Job
Mix Nodes    34.86043       21.68608       42.85714
In Table 1, Mix Nodes means that NodeA has a 2.8 GHz CPU, NodeB a 1.4 GHz CPU and
NodeC a 0.7 GHz CPU. The best match algorithm matches each job to the CPU level
requested, sends the job to the best-matching node, and leaves idle the nodes that do
not meet the minimum requirement in the user’s parameters. If users request high-level
jobs, the low-level nodes remain idle and simply waste CPU time.
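One possible reading of the best match rule is sketched below in Python. The interpretation is an assumption: the paper does not define "best match" precisely, so the sketch picks the slowest node that still meets the requested CPU clock and reports the nodes below the requirement, which are never candidates for the job.

```python
def best_match(nodes, required_ghz):
    """Pick a node for one job under an assumed best-match rule.

    nodes: dict mapping node name -> CPU clock in GHz.
    Returns (chosen node or None, sorted names of nodes below the requirement).
    """
    eligible = {name: ghz for name, ghz in nodes.items() if ghz >= required_ghz}
    excluded = sorted(name for name in nodes if name not in eligible)
    if not eligible:
        return None, excluded
    # Assumed tie-break: the eligible node closest to the requirement.
    chosen = min(eligible, key=eligible.get)
    return chosen, excluded
```

With the Mix Nodes configuration, a job that requires 2.8 GHz can only go to NodeA, so NodeB and NodeC are excluded and contribute nothing, which is the source of the waste reported in Table 1.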
We propose a resource ratio algorithm which takes the otherwise wasted resources into
account when executing jobs. We convert each node’s specification into scores (CPU, RAM,
hard disk), and with these scores we can calculate the total resources in our grid
environment. When handling batch jobs, or when the job queue is too long, the algorithm
is as follows:
for each node in node-list
    node’s score equals CPU-score adds RAM-score adds Disk-score;
    for integer i less than Total-Job-Count * (node’s score) / (Grid-Total-Score)
        send job to that node;
while (job-list is not empty)
    send one job to each node in node-list;
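The pseudocode above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' C# code: each node's share is rounded down to an integer, the scores are plain numbers supplied by the caller, and the function name and data layout are hypothetical.

```python
from collections import deque


def resource_ratio_dispatch(jobs, node_scores):
    """Distribute jobs in proportion to node scores.

    jobs: iterable of job identifiers.
    node_scores: dict mapping node name -> (cpu, ram, disk) scores.
    Returns a dict mapping node name -> list of assigned jobs.
    """
    totals = {name: sum(parts) for name, parts in node_scores.items()}
    grid_total = sum(totals.values())
    if grid_total <= 0:
        raise ValueError("grid total score must be positive")

    pending = deque(jobs)
    total_jobs = len(pending)
    assignment = {name: [] for name in totals}

    # Proportional pass: each node receives its share, rounded down (assumed).
    for name, score in totals.items():
        share = int(total_jobs * score // grid_total)
        for _ in range(share):
            if not pending:
                break
            assignment[name].append(pending.popleft())

    # Leftover pass: hand out the remaining jobs one per node in turn.
    while pending:
        for name in totals:
            if not pending:
                break
            assignment[name].append(pending.popleft())
    return assignment
```

Rounding each share down means the proportional pass never assigns more jobs than exist, and the second loop then distributes whatever remains.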
Each node in the grid environment is kept as busy as possible in order to reduce the CPU
waste ratio. Table 2 shows the CPU waste ratio using the resource ratio algorithm.
Table 2. Resource ratio algorithm CPU waste ratio (%)
Scenario     Level-01 Job   Level-02 Job   Level-04 Job
Mix Nodes    8.726068       7.796550       8.306232
In Table 2, Mix Nodes has the same meaning as in Table 1. The overall CPU waste ratio
is reduced, and the average waste ratio is 8.276283%. The remaining waste comes from
the lowest-level node, which sits idle while waiting for the other nodes to finish all jobs.
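The reported average can be checked directly from the Mix Nodes rows of Tables 1 and 2. The short Python sketch below recomputes it; the Table 1 average is not stated in the paper and is derived here only for comparison.

```python
# Mix Nodes rows copied from Table 1 and Table 2 (CPU waste ratio, %).
best_match_waste = [34.86043, 21.68608, 42.85714]
resource_ratio_waste = [8.726068, 7.796550, 8.306232]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


avg_best_match = mean(best_match_waste)
avg_resource_ratio = mean(resource_ratio_waste)

print(f"best match average:     {avg_best_match:.5f}%")
print(f"resource ratio average: {avg_resource_ratio:.6f}%")
```

The resource ratio average agrees with the 8.276283% quoted in the text, while the three best match values average roughly 33.13%.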
4. Conclusions. The grid computing environment is highly dynamic. In a highly dynamic
environment, it is difficult to allocate all resources in the grid and invoke them
appropriately. In this paper, we presented a module-based distributed grid model. There
are four modules in the proposed model, namely, the User Interface module, Resource
module, Worker module and Communication module. By exchanging local information with
each other through the Resource module, nodes can allocate and invoke resources in the
grid. With the Redirector component in the Worker module, heavily loaded nodes can
redirect jobs to other nodes to achieve load balancing. With the load balanced across
nodes, the overall performance of the grid environment is improved.
Acknowledgment. The authors gratefully acknowledge the helpful comments and sug-
gestions of the reviewers, which have improved the presentation.
REFERENCES
[1] I. Foster and C. Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure, Morgan
Kaufmann, San Francisco, 2004.
[2] I. Foster and C. Kesselman, Globus: A metacomputing infrastructure toolkit, International Journal
of Supercomputer Applications, vol.11, no.2, pp.115-128, 1997.
[3] I. Foster, C. Kesselman and S. Tuecke, GRAM: Key Concept, http://www-unix.globus.org/toolkit/
docs/3.2/gram/key/index.html, 2008.
[4] Globus, http://www.globus.org, 2010.
[5] H.-M. Lee, C.-C. Hsu and M.-H. Hsu, A dynamic supervising model based on grid environment,
Lecture Notes in Computer Science, vol.3682, pp.1258-1264, 2005.
[6] H.-M. Lee, J.-S. Su and C.-H. Chung, Resource allocation analysis model based on grid environment,
International Journal of Innovative Computing, Information and Control, vol.7, no.5(A), pp.2099-
2108, 2011.
[7] H.-M. Lee, T.-Y. Lee, C.-H. Yang and M.-H. Hsu, An optimal analyzing resources model based
on grid environment, WSEAS Transactions on Information Science and Applications, vol.3, no.5,
pp.960-964, 2006.
[8] H.-M. Lee, T.-Y. Lee and M.-H. Hsu, A process schedule analyzing model based on grid environment,
Lecture Notes in Artificial Intelligence, vol.4253, pp.938-947, 2006.
[9] H. Li, D. Groep and L. Wolters, Workload characteristics of a multi-cluster supercomputer, Job
Scheduling Strategies for Parallel Processing, 2004.
[10] A. Silberschatz, G. Gagne and P. B. Galvin, Operating System Principles, 7th Edition, John Wiley
and Sons (Asia) Pte Ltd, 2004.
[11] F. Xhafa, J. Carretero and A. Abraham, Genetic algorithm based schedulers for grid computing sys-
tems, International Journal of Innovative Computing, Information and Control, vol.3, no.5, pp.1053-
1071, 2007.