1. Starting for the cloud
-- two issues in a cluster:
resource allocation and overload
management
Ziyou Wang, Yan Li, Chao You, Minghui Zhou
Peking University
wangzy06@sei.pku.edu.cn
zhmh@pku.edu.cn
OW2 Annual Conference 2010, November 24-25, La Cantine, Paris.
www.ow2.org.
3. Cloud Computing: Challenges
The emergence of cloud computing gives application providers a
cost-efficient way to lease computing resources from a
third-party provider.
Benefits: increased resource utilization, improved business agility,
decreased power consumption…
But how to effectively allocate the cloud’s various resources to
different applications is still an open problem.
When applications hosted in the cloud face overload, meaning that
the demand on at least one of the cloud’s resources exceeds
the capacity of that resource, how can we handle the
situation?
… …
4. Shared Cluster
Consider one kind of cloud implementation: because the workloads of
different web applications are not correlated, a large-scale cluster,
called a shared cluster or data center, is maintained to host a large
number of applications simultaneously.
Each application runs on a subset of nodes.
Each node may run multiple applications.
[Figure labels: Enterprises, Users, Third parties]
5. Resource Allocation: a scenario
As the cluster’s resources are no longer occupied by a single
application, the cluster must allocate resources on
demand.
For example, when the workload of apps A and C increases, the cluster can
place new instances in the data center or re-allocate the workload.
[Figure: application users send requests to a dispatcher in front of the shared cluster. Nodes 1, 16, 99, 150 and other nodes each run several of apps A, B, C and D on middleware, are connected by a high-throughput, low-latency network, and share a repository of apps.]
6. Self-adaptive Resource Allocation Model
[Figure: self-adaptive resource allocation, driven by incoming requests and made up of resource allocation planning and resource allocation execution]
7. Our Resource Allocation Work
[Figure: architecture. Labels: dispatcher and management console (messages, requests, commands); middleware instances in cooperation; resource allocation planning (communicator, local valuator); resource allocation execution (resource partitioner, app deployer, virtual machine monitor); VMs each running an app on a customized JOnAS; repository.]
For resource allocation planning, we propose a
decentralized approach:
• Nodes decide their own resource allocation
• Market-based coordination is adopted to help them
make resource decisions
So far, the approach has been evaluated with a series of
simulated experiments, and it is being implemented in
the cluster with JOnAS.
8. Resource Allocation Planning
To support application prioritization, applications can be assigned
different utility values. Accordingly, the goal of resource
management is to maximize the total utility value of the requests
satisfied.
Inspired by human markets, we model the shared cluster as a
market, where shares of application requests are treated as goods
and nodes act as dealers that exchange the goods.
Based on its local valuation of the goods, each node autonomously
and continuously trades with others in order to find a combination of
application shares that fits the node’s resource constraints and
maximizes its income.
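The local decision each node makes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Share` type, the single-resource capacity model, and the greedy ordering by utility per resource unit are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Share:
    """A share of one application's requests (hypothetical model)."""
    app: str
    utility: float  # utility earned if the share's requests are satisfied
    demand: float   # resource units needed to serve the share (must be > 0)


def choose_shares(candidates, capacity):
    """Greedily pick shares by utility per resource unit while they fit.

    A heuristic stand-in for the node's local valuation; it does not
    guarantee the best possible combination.
    """
    chosen, used = [], 0.0
    ranked = sorted(candidates, key=lambda s: s.utility / s.demand, reverse=True)
    for share in ranked:
        if used + share.demand <= capacity:
            chosen.append(share)
            used += share.demand
    return chosen


def income(shares):
    """Total utility of the shares a node holds."""
    return sum(s.utility for s in shares)


# Invented numbers: three candidate shares, node capacity of 7 units.
candidates = [Share("A", 10, 5), Share("B", 6, 2), Share("C", 4, 4)]
picked = choose_shares(candidates, capacity=7)
print([s.app for s in picked], income(picked))
```

A node would re-run such a valuation whenever a trade is proposed, accepting it only if the resulting combination still fits its capacity and raises its income.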
9. Resource Allocation Planning
When a node wants to sell, more than one node may want to buy.
To make the seller transfer the goods to the appropriate buyers, an
auction mechanism is adopted
[Figure: auction example for app C among Node 1, Node 50, Node 65, Node 100 and the dispatcher. Step 1: the selling node multicasts its offer to the other nodes; an adjacent label reads (app C, 50%, 100 req/s). Step 2.1: each node performs a valuation. Step 2.2: interested nodes reply (Node 50: "want C, 35%"; Node 65: "want C, 20%"). Step 3: the replies are sorted. Step 4: the buyers are notified (sell C 30%, sell C 20%) and the dispatcher is informed (app C, 30% to n50, 20% to n65). The dispatcher then updates its table for app C from N1 70%, N65 10% to N1 20%, N50 30%, N65 30%.]
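The auction round above can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how bids are ranked, so the `price` field (a bidder's valuation per percentage point) and the highest-price-first rule are hypothetical, as are the names `Bid` and `run_auction`. The quantities follow the slide's example, assuming the seller offers 50% of app C in total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    node: str      # bidding node, e.g. "n50"
    wanted: float  # percentage of the app's requests the node wants
    price: float   # assumed: bidder's valuation per percentage point


def run_auction(offered, bids):
    """Allocate `offered` percent of an app's share to the highest bidders.

    Returns {node: percent granted}. Bids are served in descending price
    order; the last winner may receive less than it asked for.
    """
    allocation = {}
    remaining = offered
    for bid in sorted(bids, key=lambda b: b.price, reverse=True):
        if remaining <= 0:
            break
        granted = min(bid.wanted, remaining)
        if granted > 0:
            allocation[bid.node] = granted
            remaining -= granted
    return allocation


# Quantities from the slide's example; the prices are invented.
bids = [Bid("n50", wanted=35, price=2.0), Bid("n65", wanted=20, price=3.0)]
result = run_auction(offered=50, bids=bids)
print(result)
```

With these invented prices, n65 is served first and receives its full 20%, and n50 receives the remaining 30%, matching the split reported to the dispatcher in the figure.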
10. Our Resource Allocation Work
[Figure: architecture diagram, as in slide 7]
For resource allocation execution:
• Integrate a VMM into the middleware
• Automatically load the app and partition the resources at
runtime via the VMM
• Customize JOnAS for the app, and store the customized
image in the repository
• Dispatch the workload proportionally
Currently, we use OpenVZ, a lightweight OS-level VMM, as a
case study, and are working on integrating it into the
middleware.
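Proportional workload dispatching can be illustrated with a smooth weighted round-robin, a well-known technique swapped in here for the example; the slides do not state which scheme the dispatcher uses. The class name `ProportionalDispatcher` and the share values are hypothetical.

```python
from collections import Counter


class ProportionalDispatcher:
    """Routes requests to nodes in proportion to their share of an app."""

    def __init__(self, shares):
        # shares: {node: percentage of the app's workload assigned to it}
        self.shares = dict(shares)
        self.current = {node: 0 for node in self.shares}

    def next_node(self):
        """Smooth weighted round-robin: pick the node with the largest credit."""
        total = sum(self.shares.values())
        for node, weight in self.shares.items():
            self.current[node] += weight
        best = max(self.current, key=self.current.get)
        self.current[best] -= total
        return best


# Invented shares for one app spread over three nodes.
dispatcher = ProportionalDispatcher({"n1": 50, "n50": 30, "n65": 20})
counts = Counter(dispatcher.next_node() for _ in range(100))
print(counts)
```

Compared with random weighted choice, this scheme interleaves nodes evenly, so a node with a small share is not starved for long stretches of requests.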
12. Examples
On September 11th, 2001, for instance, the workload on a
popular news web site increased by an order of magnitude in
30 min, doubling every 7 min in that
period.
April 21st, 2010 was China's National Day of Mourning for the Yushu
earthquake victims. Theatre and sporting performances were
cancelled, karaoke bars shut, and the culture ministry
ordered the suspension of all online music, games, comics, films
and TV shows.
Too many people chose to visit an online shopping site instead.
13. When does overload happen?
Overload prevention is a critical goal so that a system can remain
operational in the presence of overload even when the incoming
request rate is several times greater than the system’s capacity.
It is well known that the workload seen by Internet applications
varies over multiple time-scales and often in an unpredictable
fashion.
Unexpected things are always happening:
• Being featured on national television or in a major newspaper
• Under-provisioning for sales-boosting holidays
14. The TaoBao Architecture
Apache + Application Server + MySQL
200+ applications, thousands of components
12k servers
2k~3k Java servers
[Figure labels: Shop, Search, Cart, Product Browsing, Product Recommendation]
15. The Reality – Manual Service Degradation
In response to overload:
CNN replaced its front page with a simple HTML page that could
be transmitted in a single Ethernet packet.
Taobao turned off a subsystem.
All these techniques are implemented manually, though a better
approach would be to degrade service gracefully and automatically
in response to load.
Which point causes the overload?
Which resource is the bottleneck?
Which service should be degraded or turned off?
Will all users be affected or not?
16. Automatic Degradation Mechanism
Overload Priority defines the priorities of different services and
the degradation actions that can be taken.
Overload Detection is responsible for signaling when the
application becomes unstable.
Overload Localization is triggered to locate the resource bottleneck.
Overload Controller takes appropriate actions to degrade some
non-essential services, releasing resources to support key services.
[Figure: overload degradation mechanism, with Overload Priority and Overload Controller acting on a set of services]
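How the four parts could fit together is sketched below. It is a simplified illustration, not the mechanism's actual design: the `Service` type, the rule "turn off the lowest-priority service that uses a bottleneck resource", and all numbers are assumptions, and turning a service off stands in for whatever degradation action is configured.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    priority: int         # assumed convention: higher means more important
    usage: dict           # resource name -> units consumed while enabled
    enabled: bool = True


def locate_bottlenecks(services, capacity):
    """Detection and localization: resources whose demand exceeds capacity."""
    demand = {}
    for svc in services:
        if svc.enabled:
            for res, units in svc.usage.items():
                demand[res] = demand.get(res, 0) + units
    return [res for res, cap in capacity.items() if demand.get(res, 0) > cap]


def degrade(services, capacity):
    """Controller: disable low-priority services until nothing is overloaded."""
    turned_off = []
    while True:
        bottlenecks = locate_bottlenecks(services, capacity)
        if not bottlenecks:
            break
        candidates = [s for s in services
                      if s.enabled and any(s.usage.get(r, 0) > 0 for r in bottlenecks)]
        if not candidates:
            break  # nothing left to degrade
        victim = min(candidates, key=lambda s: s.priority)
        victim.enabled = False
        turned_off.append(victim.name)
    return turned_off


# Invented priorities and resource figures.
services = [
    Service("cart", 3, {"cpu": 40, "db": 50}),
    Service("search", 2, {"cpu": 50, "db": 30}),
    Service("recommendation", 1, {"cpu": 30, "db": 40}),
]
capacity = {"cpu": 100, "db": 100}
print(degrade(services, capacity))
```

In this example both resources start at 120 units of demand against a capacity of 100, and disabling the lowest-priority service is enough to bring both back under capacity.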
18. Considerations
Hard to be transparent to the user (what can be degraded?
sometimes, how?)
Used alone, it can help delay overload, but it needs to be
combined with other techniques to be fully effective:
Dynamic resource allocation
Admission control
Service differentiation
… …