Energy-Efficient Task Scheduling in Cloud Environment
Summer Intern Report
1. Topic: Virtual Machine Placement with Optimised Cost
Submitted By:
Shantanu Bharadwaj
Dept. of Comp. Science & Engg.
IIT Guwahati
Under the guidance of:
Dr. T. Venkatesh
Dept. of Comp. Science & Engg.
IIT Guwahati
2. Abstract
Almost all modern online services run on geo-distributed data centers, and
fault tolerance is one of the primary requirements that decides the revenue of
the service provider. A growing number of Internet services, such as web
services, business transactions, and cloud computing services, are being
deployed over geo-distributed data centers. Geo-distribution is important for
latency, availability, and increasingly also for efficiency. Due to the rapid
growth in the volume of demand served, large numbers of geo-distributed data
centers today can benefit from the same multi-megawatt economies of scale that
were initially limited to a few centralized ones. As a result, modern cloud
infrastructures are already highly geo-distributed. Recent experience has shown
that the failure of a data center (at a site) is inevitable. In order to mask
such a failure, spare compute capacity needs to be provisioned across the
distributed data centers, which leads to additional cost. While the existing
literature addresses the capacity provisioning problem only to minimize the
number of servers, this report argues that the operating cost needs to be
considered as well. Since both operating cost and client demand vary across
space and time, we consider cost-aware capacity provisioning to account for
their impact on the operating cost of data centers. We propose an optimization
framework to minimize the Total Cost of Ownership (TCO) of the cloud provider
while designing fault-tolerant geo-distributed data centers.
The second part of this report deals with the problem of VM placement. When a
virtual machine is deployed on a host, the process of selecting the most
suitable host for the virtual machine is known as virtual machine placement, or
simply placement. During placement, hosts are rated based on the virtual
machine's hardware and resource requirements and the anticipated usage of
resources, and the administrator selects a host for the virtual machine based
on these ratings. The operating cost of VM placement has two important
parameters: electricity cost and communication cost. In a Cloud environment,
execution requires proper resource management and scheduling due to the high
process-to-resource ratio, and resource scheduling is a complicated task
because there are many alternative computers with varying capacities. The goal
of this project is to propose a model for a job-oriented resource scheduling
algorithm in a cloud computing environment. This report proposes a cost-aware
heuristic approach for optimal VM placement among a given number of physical
machines in a data center using resource scheduling techniques; the idea can be
extended to a group of data centers. The results show that the operating cost
has great potential for improvement via optimal VM placement.
3. Introduction
A data center is a facility that houses computer systems and associated
components such as telecommunications and storage systems. It is a centralized
repository, either physical or virtual, for the storage, management, and
dissemination of data and information. At a basic level, a data center's
components are server, network, and storage hardware; other components include
power, cooling, fire suppression, security systems, and network connectivity.
A geo-distributed data center is a collection of small, geographically
distributed, fully automated data centers. Geo-distributed data centers are
popular for the following reasons: first, latency to clients is reduced, as
their requests are served by closer data centers; second, they are more
effective in protecting data from catastrophes. In addition to these advantages
over a single data center, geo-distributed data centers are gaining popularity
because a single data center is often too small.
In a general model of a geo-distributed data center, two types of parties
interact:
Clients- who wish to execute some operations or run some protocols.
Servers- which help implement operations, such as storing data.
Business-critical applications running in geo-distributed data centers
(henceforth simply referred to as data centers) demand high availability
because downtime entails huge loss of revenue, the cost of idle employees, and
loss of productivity. In addition, outages lead to reduced customer
satisfaction, damaged brand perception, and regulatory problems. Instances of a
data center failing at a site have been reported by many cloud service
providers, such as Amazon, Facebook, and Google.
Data center unavailability can be due to reasons ranging from software bugs and
router misconfiguration in the Internet, to human errors caused by poor
supporting documentation and training, to man-made or natural disasters. Given
this industry experience, it is evident that the failure of a data center is
inevitable. Designing a fault-tolerant geo-distributed data center usually
involves spare capacity provisioning (allocating additional servers to mask a
failure) across different data center sites, while satisfying a set of
constraints based on electricity prices, infrastructure cost, operating cost,
demand at each location, and the delay faced by customers. Henceforth, in this
report, the failure of a single data center is the only kind of failure we
consider.
Cloud computing is developing based on various recent advancements in
virtualization, Grid computing, Web computing, Utility computing, and
related technologies. Cloud computing provides both platforms and
applications on demand through the Internet or Intranet. Cloud
computing is a kind of Internet-based computing that provides shared
processing resources and data to computers and other devices on
demand. It is a model for enabling ubiquitous, on-demand access to a
shared pool of configurable computing resources (e.g., networks, servers,
storage, applications and services), which can be rapidly provisioned and
released with minimal management effort.
Resource scheduling plays an important role in Cloud data centers. One of the
challenging scheduling problems in Cloud data centers is the allocation of VMs.
A data center is composed of a set of hosts (PMs), which are responsible for
managing VMs. A host is a component that represents a physical computing node
in a Cloud; it is assigned a preconfigured processing capability (e.g.,
expressed in Millions of Instructions Per Second or GHz), memory, storage, and
a scheduling policy for allocating VMs. A number of hosts can also be
interconnected to form a cluster or a data center. In this report, we introduce
a framework for cost-efficient resource scheduling of real-time VMs,
considering only the computing resources.
4. Cost-aware Capacity Provisioning
Spare capacity provisioning across a geo-distributed data center to mask the
failure of a single data center can be illustrated by a simple example.
Consider a distributed data center with 5 sites, each with a compute capacity
of 20 units. To mask the failure of any one data center at a time, we require a
spare capacity of 20/4 = 5 units at each of the remaining data centers.
Therefore the total spare capacity required is 5*5 = 25 units, so the
additional cost of building a fault-tolerant data center that can mask a single
failure is 25%. This naive approach distributes the spare capacity uniformly.
However, not all data centers have the same number of servers, and different
locations are characterized by variation in electricity cost, bandwidth cost,
carbon tax, and user demand over time. Therefore, the main challenge in
designing a fault-tolerant distributed data center is to provision spare
capacity so that, along with the capital cost (the cost of spare servers), the
operating cost is minimized while the client latency constraint is satisfied
even during a period of failure. Current literature proposes an optimization
framework with the objective of simply minimizing the number of servers needed
to meet the delay and availability constraints; but the operating cost across
different geographical locations also needs to be minimised.
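The uniform (naive) provisioning arithmetic above can be sketched in a few
lines of Python; the function name and interface are our own illustration:

```python
# Spare capacity needed to mask one site failure under uniform
# provisioning: each surviving site absorbs an equal share of the
# failed site's load. Values mirror the 5-site example in the text.

def uniform_spare_capacity(num_sites: int, capacity_per_site: float):
    """Return (spare per site, total spare, overhead fraction)."""
    spare_per_site = capacity_per_site / (num_sites - 1)
    total_spare = spare_per_site * num_sites
    overhead = total_spare / (num_sites * capacity_per_site)
    return spare_per_site, total_spare, overhead

spare, total, overhead = uniform_spare_capacity(5, 20)
print(spare, total, overhead)  # 5.0 units/site, 25.0 units, 0.25 (25%)
```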
Considering the cost of a server to be $2000 and its lifetime to be 4 years, we
calculate the energy-to-acquisition cost (EAC), defined as the ratio of the
cost of running a server for its lifetime to its acquisition cost:
Power cost = 4 years * (8760 hours/year) * (electricity cost) * (server power) * PUE
EAC = (power cost / server cost) * 100
PUE, or Power Usage Effectiveness, is the ratio of the total amount of energy
used by a data center facility to the energy delivered to the computing
equipment. It is a measure of how efficiently a data center uses energy;
specifically, how much energy goes to the computing equipment, in contrast to
cooling and other overhead:
PUE = Total Facility Energy / IT Equipment Energy
A higher EAC indicates that power and cooling cost more than the server
acquisition; therefore, the lower the EAC, the more economical the system.
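As a minimal sketch, the EAC formulas above can be computed as follows; the
electricity price, server power draw, and PUE used in the example call are
assumed values for illustration, not figures from this report:

```python
# EAC (energy-to-acquisition cost) for one server, using the formulas
# above. The electricity price, server power and PUE passed in the
# example call below are illustrative assumptions.

LIFETIME_YEARS = 4
HOURS_PER_YEAR = 8760
SERVER_COST = 2000.0        # $ (acquisition cost from the text)

def eac(electricity_cost_per_kwh: float, server_power_kw: float,
        pue: float) -> float:
    """Lifetime power cost as a percentage of the acquisition cost."""
    power_cost = (LIFETIME_YEARS * HOURS_PER_YEAR
                  * electricity_cost_per_kwh * server_power_kw * pue)
    return power_cost / SERVER_COST * 100

# e.g. $0.07/kWh, a 200 W server, and a PUE of 1.5
print(round(eac(0.07, 0.2, 1.5), 1))  # -> 36.8
```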
This report formulates a mixed integer linear program (MILP) framework for
cost-aware capacity provisioning in fault-tolerant geo-distributed data centers
to mask single data center failures. Along with the cost of additional servers,
we also consider the variation in electricity prices across space and time in
determining the optimal capacity that minimizes the operating cost.
Optimization Model
Assumptions:
A mechanism for failure detection and request re-routing is already present.
Only the failure of a single data center (a site) is considered at a time.
Notations used:
Delay: Let Dmax be the maximum latency allowed for a client based on the
service level agreements with the cloud provider. Let Dsu be the propagation
delay between user location u and data center location s. The data center must
be designed such that, even after the failure of a site, the latency continues
to be lower than Dmax.
Cost: Let S and U denote the sets of data center and client locations,
respectively. The cost of a server (acquisition cost) is denoted by α. Let σs
denote the cost of access bandwidth at site s.
Server Provisioning: Let ms denote the number of servers required in the data
center at s. We define Mmin and Mmax to be the minimum and maximum number of
servers that can be provisioned at any data center.
Power Consumption: Let Pidle be the average power drawn by a server in the idle
condition and Ppeak the power consumed when a server runs at peak utilization.
Let u(s,h) denote the average server utilization at data center s in hour h,
and let Es be the PUE of data center s. The total power consumed at a data
center location s belonging to S, at hour h belonging to H, can then be
modelled as:
P(s,h) = ms * [ Pidle + (Es - 1) * Ppeak + (Ppeak - Pidle) * u(s,h) ]
The TCO, which includes the server acquisition cost and the operating cost, is
defined as:
TCO = α * Σ(s in S) ms + Σ(s in S) Σ(h in H) e(s,h) * P(s,h)
where e(s,h) is the electricity price at location s in hour h. The minimization
is subject to the following constraints: the number of servers provisioned at
each site lies between Mmin and Mmax, and the demand of every user location is
served within the latency bound Dmax, both during normal operation and after
the failure of any single site. The objective function is the sum of the total
cost incurred by all the individual data centers over a day; the goal is to
minimize this objective function, that is, the total cost of ownership (TCO).
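Assuming a standard linear server power model (an assumption here: power grows
linearly from Pidle at zero utilization to Ppeak at full utilization, with
non-IT overhead captured by the PUE), the hourly site power and the TCO can be
sketched as follows; all numeric inputs are hypothetical:

```python
# Hourly site power and TCO under a linear server power model:
# per-site power = m * (Pidle + (PUE - 1) * Ppeak + (Ppeak - Pidle) * u).
# The prices, utilizations and server counts used below are made up.

PIDLE, PPEAK = 0.1, 0.2      # kW per server (assumed)
ALPHA = 2000.0               # server acquisition cost, $

def site_power_kw(m_servers, utilization, pue):
    """Power drawn by one site in one hour (kW)."""
    return m_servers * (PIDLE + (pue - 1) * PPEAK
                        + (PPEAK - PIDLE) * utilization)

def tco(servers, prices, utilizations, pues):
    """servers[s]; prices[s][h] in $/kWh; utilizations[s][h]; pues[s]."""
    capital = ALPHA * sum(servers)
    operating = sum(
        prices[s][h] * site_power_kw(servers[s], utilizations[s][h], pues[s])
        for s in range(len(servers))
        for h in range(len(prices[s]))
    )
    return capital + operating

# one site, one hour: 10 servers, $0.1/kWh, 50% utilization, PUE 1.5
print(tco([10], [[0.1]], [[0.5]], [1.5]))  # -> 20000.25
```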
5. VM Placement in Distributed Data Centers
In order to allocate computing resources efficiently, scheduling becomes a very
complicated task in a cloud computing environment, where many alternative
computers with varying capacities are available. An efficient task scheduling
mechanism can meet users' requirements and improve resource utilization. Cloud
service providers often receive many computing requests with different
requirements and preferences from users simultaneously. Some tasks need to be
fulfilled at a lower cost with fewer computing resources, while other tasks
require higher computing ability and take more bandwidth and computing
resources.
In this report, only computing resources are considered. A data center is
composed of a set of hosts (physical machines), which are responsible for
managing VMs during their life cycles; as described above, each host represents
a physical computing node with a preconfigured processing capability, memory,
storage, and a scheduling policy for allocating VMs, and a number of hosts can
be interconnected to form a cluster or a data center.
Data centers (possibly distributed across multiple geographical locations) are
the places that accommodate computing equipment and are responsible for
providing power and air conditioning for the computing devices. A data center
could be a single building or could span several buildings. With the dynamic
distribution and management of virtual and shared resources in this new
application environment, Cloud computing data centers face new challenges:
efficient scheduling strategies and algorithms must be designed to adapt to
different business requirements and to satisfy different business goals.
Key technologies of resource scheduling include:
Scheduling strategies: This is the top level of resource scheduling
management, which needs to be defined by data center owners and
managers. It mainly determines the resource scheduling goals and
ensures that they are satisfied.
Optimization goals: The scheduling center needs to identify different
objective functions to compare the pros and cons of different types of
scheduling. Common objective functions include minimum cost, maximum
profit, and maximum resource utilization.
Scheduling algorithms: Good scheduling algorithms need to produce
optimal results according to the objective functions.
GreenCloud architecture:
Figure: Proposed GreenCloud architecture
This figure describes a layered architecture for GreenCloud. There is a web
portal at the top layer for the user to select resources and send requests:
essentially, a uniform view of the few types of VMs that are preconfigured for
users to choose from. Once user requests are initiated, they go to the next
level, CloudSched, which is responsible for choosing appropriate data centers
and PMs based on user requests. This layer can manage a large number of Cloud
data centers consisting of thousands of PMs. At this layer, different
scheduling algorithms can be applied in different data centers based on
customer characteristics. At the lowest layer are the Cloud resources, PMs and
VMs, each consisting of a certain amount of CPU, memory, storage, and
bandwidth. At the Cloud resource layer, virtual management is mainly
responsible for keeping track of all VMs in the system, including their status,
required capacities, hosts, arrival times, and departure times.
This report proposes a queuing model where a client requests virtual machines
for a predefined duration. Network resources are not considered: jobs are
assumed not to communicate with each other or to transmit or receive data, and
no preference is expressed as to where the VMs are to be scheduled. An
algorithm is proposed to distribute VMs optimally so as to minimize the
distance between user VMs in a data center grid; the only network constraint
used is the Euclidean distance between data centers, and no specific connection
requests or user differentiation is used. A second algorithm is proposed to
schedule VMs within one data center to minimize communication cost; no network
topology is used, and only the monetary cost of transmitting data is considered
for VM requests.
Real-time VM request model:
The Cloud computing environment is a suitable solution for real-time VM
service because it leverages virtualization. When users request execution
of their real-time VMs in a Cloud data center, appropriate VMs are
allocated.
A real-time VM request can be represented as an interval vector:
VMRequestID(VM typeID, start time, finish time, requested capacity).
For example, vm1(1, 0, 6, 0.25) shows that VM request vm1 asks for a VM of
Type1 (corresponding to the integer 1) with a start time of 0, a finish time of
6, and 25% of the total capacity of a Type1 PM.
Request formats can vary according to the definitions made by data center
owners and managers. In this report, the request format is as follows:
VMRequestID(VM typeID, start time, finish time, requested CPU capacity,
requested storage capacity).
For example, vm1(1, 0, 6, 2, 1) shows that VM request vm1 asks for a VM of
Type1 with a start time of 0 and a finish time of 6, and that the request needs
2 units of CPU and 1 unit of storage.
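The request format above can be sketched as a small Python record; the class
and field names are our own illustration:

```python
# A VM request in the report's format:
# VMRequestID(VM typeID, start time, finish time, CPU units, storage units).

from dataclasses import dataclass

@dataclass
class VMRequest:
    request_id: str
    vm_type: int
    start: int          # hour the VM starts
    finish: int         # hour the VM finishes
    cpu: int            # requested CPU units
    storage: int        # requested storage units

    @property
    def processing_time(self) -> int:
        return self.finish - self.start

# vm1(1, 0, 6, 2, 1) from the text:
vm1 = VMRequest("vm1", 1, 0, 6, 2, 1)
print(vm1.processing_time)  # -> 6
```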
Assumptions in the proposed model:
All tasks are independent. There are no precedence constraints
other than those implied by the start and finish times.
Each PM is always available (i.e., each machine is continuously
available in [0, ∞)).
Each PM has an operating cost and communication cost associated
with it.
Each VM request has an electricity cost and communication
overhead associated with it.
Each PM is linked with every other PM in the system.
Each communication link is unidirectional.
The capacities of VMs and PMs are strongly divisible. If P and V denote
the lists of capacities of the PMs and VMs respectively, they are
strongly divisible if every item in list P exactly divides every item in
list V; that is, the capacities demanded by VM requests are multiples of
the capacities of the PMs.
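The strong-divisibility assumption can be checked mechanically; this helper is
our own illustration:

```python
# Check the strong-divisibility assumption: every PM capacity in P
# must exactly divide every VM capacity in V.

def strongly_divisible(pm_capacities, vm_capacities) -> bool:
    return all(v % p == 0
               for p in pm_capacities
               for v in vm_capacities)

print(strongly_divisible([1, 2], [2, 4, 6]))  # -> True
print(strongly_divisible([1, 2], [3]))        # -> False: 3 % 2 != 0
```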
Proposed Algorithm:
The heuristic developed is based on the first-fit decreasing algorithm along
with some cost-optimisation techniques. The VM requests are sorted in
decreasing order of their processing times. Each physical machine has a
different operating cost at different hours, each communication link has a
communication overhead associated with it, and each VM request has an
electricity cost and a communication cost. The algorithm compares the requested
capacity with the capacity of each physical machine, finds the physical machine
with the lowest cost, and assigns it to the VM request. If the capacity is not
met, the communication costs of forwarding the remainder of the request to
other physical machines are considered, and the physical machine with the
minimum combined cost is found and assigned to the VM request as well.
The pseudo-code for the algorithm is as follows:
Input:
Number of VM requests
VM requests (indicated by their VM ID, start time, finish time, CPU
capacity, storage capacity)
Number of PMs
PM (PM ID, CPU capacity, storage capacity)
Number of hours the whole system will run
Operating cost of each PM at every hour
Communication cost for each link
Output:
PM ID of the physical machine assigned to each VM request, along with the cost
incurred.
Pseudo Code:
n <- number of VM requests
m <- number of PMs
h <- number of hours
e[i][j] <- electricity cost of physical machine i at hour j
d[i][j] <- communication overhead of the link between PM i and PM j
Re[i] <- electricity cost of VM request i
Rb[i] <- communication cost of VM request i
avg_e[i] <- average electricity cost of PM i
avg_b[i] <- average communication cost of PM i
v[i][j] <- cost of allocating PM j to request i
v_min <- minimum cost
machine[i] <- physical machine selected for VM request i

for i = 1 to n do
    processing_time[i] = finish_time[i] - start_time[i]
sort the requests in decreasing order of processing_time

for i = 1 to m do
    avg_e[i] = average of e[i][j] over j = 1 to h

for i = 1 to m do
    for j = 1 to m do
        if i = j then d[i][j] = 0
    avg_b[i] = average of d[i][j] over j != i

for i = 1 to n do
    for j = 1 to m do
        v[i][j] = (avg_e[j] * Re[i]) / (avg_e[j] + avg_b[j])

for i = 1 to n do
    v_min = minimum of v[i][j] over j; machine[i] = that j
    if capacity_request[i] <= capacity_machine[machine[i]] then
        machine[i] is allocated to request i
    else
        capacity_remaining = capacity_request[i] - capacity_machine[machine[i]]
        for j = 1 to m do
            v[i][j] = (avg_e[j] * Re[i]) / (avg_e[j] + avg_b[j])
                      + (avg_b[j] * Rb[i]) / (avg_e[j] + avg_b[j])
                      + (avg_e[j+1] * Re[i]) / (avg_e[j+1] + avg_b[j+1])
This process is repeated until the requested capacity of the VM request is
fully met.
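A runnable Python sketch of this heuristic is given below. It assumes a
simplified fallback (a request that does not fit on the cheapest PM is
forwarded to the next-cheapest one, paying the link overhead term from the
pseudocode); the function names, data layout, and tie-breaking are our own
choices, not prescribed by the report:

```python
# First-fit decreasing on processing time, with the cost terms from the
# pseudocode. Needs at least two PMs (average link cost is over j != i).

def average(xs):
    return sum(xs) / len(xs)

def place_vms(requests, pms, elec_cost, link_cost):
    """
    requests: list of dicts {id, start, finish, cpu, Re, Rb}
    pms:      list of dicts {id, cpu}
    elec_cost[i][h]: electricity cost of PM i at hour h
    link_cost[i][j]: communication overhead of the link between PM i and j
    Returns {request id: (pm index, cost)}.
    """
    m = len(pms)
    avg_e = [average(elec_cost[i]) for i in range(m)]
    avg_b = [average([link_cost[i][j] for j in range(m) if j != i])
             for i in range(m)]

    # first-fit decreasing: longest processing time first
    order = sorted(requests, key=lambda r: r["finish"] - r["start"],
                   reverse=True)
    allocation = {}
    for r in order:
        cost = [(avg_e[j] * r["Re"]) / (avg_e[j] + avg_b[j])
                for j in range(m)]
        best = min(range(m), key=lambda j: cost[j])
        if r["cpu"] <= pms[best]["cpu"]:
            allocation[r["id"]] = (best, cost[best])
        else:
            # capacity not met: forward to another PM, paying link overhead
            alt_cost = {j: cost[j]
                           + (avg_b[j] * r["Rb"]) / (avg_e[j] + avg_b[j])
                        for j in range(m) if j != best}
            alt = min(alt_cost, key=alt_cost.get)
            allocation[r["id"]] = (alt, cost[best] + alt_cost[alt])
    return allocation
```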
Results:
A small example has been taken to show the working of the algorithm. In this
example, the data center has 3 PMs with identical capacities, i.e., 2 units of
CPU and 1 unit of storage each. 3 VM requests are considered, with varying
start times, end times, and capacities. The goal is to allocate PMs to them
such that the cost is minimised. The costs considered are the average operating
cost of the PMs, the average communication cost of the PMs, the electricity
cost of the VM requests, and the communication overhead of the VM requests. The
output of the implemented algorithm is as follows:
[Output of the algorithm implementation]
In this algorithm, if there are three nodes and three VMs to be scheduled, each
node is allocated one VM, provided all the nodes have enough available
resources to run the VMs. The main advantage of this algorithm is that it
utilizes all the resources in a balanced manner.
Comparison with the traditional method:
The simplest approach to this problem does no sorting of the VM request IDs. It
is the traditional approach, based on Round Robin scheduling: the requests are
served in FCFS order, and the PM with the lowest average operating cost is
assigned to each request. If the capacity is not met, the PM with the next
lowest average operating cost is assigned, and the process continues. No
communication overhead is introduced.
Taking the same values from the example above, this is the order and cost of
the PMs assigned to the VM requests:
Request ID | Start time | End time | Capacity (CPU) | Capacity (Storage) | Re (electricity cost)
    1      |     0      |    2     |       4        |         2          |        100
    2      |     0      |    6     |       2        |         1          |         50
    3      |     0      |    4     |       2        |         1          |         50
Average operating costs: PM1 = 6, PM2 = 5, PM3 = 7
Naturally, PM2 is the first choice of every request, as its average operating
cost is the least.
Cost of running VM1 on PM2 is 500; capacity not met.
PM1 has the next lowest average operating cost; cost of running VM1 on PM1 is
600.
Cost of running VM2 on PM2 is 250.
Cost of running VM3 on PM2 is 250.
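The Round Robin baseline arithmetic can be reproduced with a short sketch. The
cost of serving a request on a PM is taken as (average operating cost) * Re,
which matches the worked numbers above (e.g. VM1 on PM2: 5 * 100 = 500); the
function is our own illustration:

```python
# Round Robin / FCFS baseline: each request goes to the PM with the
# lowest average operating cost; if its capacity is not met, the
# next-cheapest PM also serves it, and so on. No communication
# overhead is charged.

def round_robin_costs(requests, pm_costs, pm_capacity):
    """requests: list of (request_id, cpu, Re); pm_costs: avg op costs."""
    order = sorted(range(len(pm_costs)), key=lambda i: pm_costs[i])
    results = {}
    for rid, cpu, re in requests:
        total, remaining = 0, cpu
        for pm in order:
            total += pm_costs[pm] * re
            remaining -= pm_capacity
            if remaining <= 0:
                break
        results[rid] = total
    return results

reqs = [("vm1", 4, 100), ("vm2", 2, 50), ("vm3", 2, 50)]
print(round_robin_costs(reqs, [6, 5, 7], 2))
# vm1: 500 on PM2 + 600 on PM1 = 1100; vm2: 250; vm3: 250
```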
[Bar chart comparing the total cost per request (scale 0 to 1200) under Our
Heuristic and Round Robin, for Requests 1, 2, and 3]
Advantages of the resource scheduling algorithm:
Easy access to resources and better resource utilization.
In this report, the implementation of the optimized algorithm is
compared with the traditional task scheduling algorithm. The main goal
of the optimized algorithm is to reduce the cost compared to the
traditional one.
This algorithm improves on the traditional cost-based scheduling
algorithm by making an appropriate mapping of tasks to resources.
This algorithm computes the priority of tasks on the basis of different
task attributes and then sorts the tasks onto a service that can
complete them.
Conclusion:
Thus, this report argued the need for cost-aware capacity provisioning for
geo-distributed data centers that can tolerate the failure of a single data
center. We proposed an MILP optimization model that reduces the total cost of
ownership (TCO), including both capital and operating cost factors, while
provisioning servers across different locations with varying running costs.
This report also stated that scheduling is one of the most important tasks in a
cloud computing environment, and that priority is an important issue in job
scheduling. The heuristic developed using resource scheduling techniques is
thus helpful in minimising the cost incurred during VM placement.