2nd Eucalyptus Bay Area Meet Up with Rich Wolski
- 1. Eucalyptus Architecture and Implementation
Rich Wolski, CTO
March 1, 2012
© 2012 Eucalyptus Systems, Inc. -- confidential
- 2. Eucalyptus Multi-tiered Service Architecture
[Diagram: three tiers, top to bottom: Service Delivery (one box), Inventory and Scheduling (three boxes), Actualization (five boxes). Arrow labels: User Requests, User Transactions.]
- 3. Eucalyptus Components
• Cloud Controller (CLC)
– User request processing (except for Walrus), Credentials
management, VM (instance) state management
• Walrus (S3)
– S3 user request processing, Append-only, Put/Get object storage
• Cluster Controller (CC)
– VM inventory, Network provisioning/security group implementation
• Storage Controller (SC)
– Block level, network attached storage (SAN and Linux)
• Node Controller (NC)
– Hypervisor interface and control, VM launch/decommissioning
• VMware Broker
– Gateway between CC and ESX and/or vSphere for VMware
- 4. Component Architecture
[Diagram: User Requests arrive at the Service Delivery tier (CLC and Walrus). Beneath it sit three CC + SC pairs, and beneath those, five boxes each labeled NC/VMWareB (Node Controller or VMware Broker).]
- 5. Eucalyptus Generations
• Eucalyptus 1.X (June 08 through Sep. 10)
– University code
• Eucalyptus 2.X (June 10 through Feb. 11)
– Commercial focus, early production
• Eucalyptus 3.X (present - )
– Production operational improvements
– Full commercial feature set (almost)
• Few, if any features deprecated
– BitTorrent?
- 6. New Eucalyptus 3.0 Features
• High-availability (HA) of the Eucalyptus Service
– Hot fail-over and repair for all components except NC
• AWS Identity and Access Management (IAM) API plus
extensions for private clouds
– Quotas and metering
• Eucalyptus Block Storage improvements
– AWS volume-backed (“bootable”) instance API (persistent instances)
– NetApp and JBOD support added to existing Dell EqualLogic support
• Full support for Windows images
– Seven different versions, AWS compatible authentication,
sysprep, ephemeral disk
• Accounting/Usage reporting
– Charge-back interface linked to quotas
- 7. Eucalyptus 3.0 Platform Improvements
• Revamped image caching in the NC
– Faster instance starts using copy-on-write
• Refactored VMware Broker
– Faster and more robust image preparation, support for vSphere 4.X,
improved scale, more extensive deployment topologies
• Extended Linux distro support
– RHEL 5 and RHEL 6, packages for Canonical LTS (Ubuntu 10.04)
• Substantial improvement in automated QA
– Full QA sequence is 5 days (features + distros + hypervisors +
deployment topologies + networking modes)
• Re-designed administrative web UI
• Improved command-line admin tools
• Re-designed packaging, upgrade and dependency management
• Re-designed installation mechanism (package repositories)
- 8. Eucalyptus in The Wild
• Eucalyptus 2.0 Deployments
– Games, mobile infrastructure, media, telecom
• Tons of feedback
– Not all of it angry
• Top 3
– Platform HA -> VM connectivity and request service
– Quotas, accounting, reporting
– Windows (fast image creation and start)
- 9. High Availability
• Eliminate single point of failure
– Host failure
– Network connectivity failure (including network partitions)
• Tolerate as many multiple-failure cases as possible
• Avoid data loss at all costs
– Fail stop is better than data loss
• Availability of the services that Eucalyptus offers
– Eucalyptus requests
– VM connectivity and storage
– Not VM HA -> application level
- 10. HA Web Service Architecture
• All Eucalyptus components are implemented as Web
Services
– CLC, Walrus, SC, VMware Broker: Java
– CC and NC: C
• CC and NC are each implemented in a separate Axis2/C
service container
• CLC, Walrus, SC, and VMware Broker share a web
service stack and JVM when co-located
- 11. PoC Configuration
[Diagram: two identical head-node stacks side by side, each on Linux and containing CLC, Walrus, SC, VMware Broker (labeled “VM Wb”), Web Service, DB management, and CC; below them, five NCs, each on Linux.]
- 12. Multi-component Failure
[Diagram: the same two-stack layout as the PoC Configuration slide (CLC, Walrus, SC, VMware Broker, Web Service, DB management, and CC on each of two Linux hosts, above five NCs), used to illustrate failure of multiple components.]
- 13. Production
[Diagram: components deployed as redundant pairs: CLC ×2, Walrus ×2, CC ×2, SC ×2, VMware Broker (labeled “VMb”) ×2; below them, five NCs, each on Linux.]
- 14. Group Membership and Heartbeat
• HA is from the perspective of the “master” CLC
• JGroups determines which machines are “up”
– The network connecting the “up” machines is unpartitioned
• Heartbeat determines which services are available within
the “up” group
• Back-up CLC monitors the “up” group to determine if it
contains a master
– If not, it becomes the master
• Master and Back-up DBs kept synced
– Resync when failed CLC is restored
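
The promotion rule on this slide can be sketched in a few lines. This is only an illustration of the stated logic, not Eucalyptus code: the function name, the `up_group` set (standing in for the membership view that JGroups provides), and the `service_alive` map (standing in for heartbeat results) are all assumptions made for the example.

```python
def backup_should_promote(up_group, service_alive, master, backup):
    """Decide whether the back-up CLC should become the master.

    up_group      -- set of host names the membership layer reports as "up"
                     (hypothetical stand-in for the group view)
    service_alive -- dict mapping host name -> True if the CLC service on
                     that host answers heartbeats (hypothetical)
    master/backup -- host names of the current master and back-up CLCs
    """
    # A back-up that is itself outside the "up" group cannot judge anything.
    if backup not in up_group:
        return False
    # The group "contains a master" only if the master host is up AND its
    # CLC service is answering heartbeats.
    master_present = master in up_group and service_alive.get(master, False)
    return not master_present


if __name__ == "__main__":
    group = {"clc-a", "clc-b", "cc-1"}
    print(backup_should_promote(group, {"clc-a": True}, "clc-a", "clc-b"))
    print(backup_should_promote(group, {"clc-a": False}, "clc-a", "clc-b"))
```

Keeping membership ("which machines are up") separate from heartbeat ("which services answer") mirrors the two-layer split described above; the DB resync step is not modeled here.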
- 15. Interesting Wrinkles
• CLC and Walrus have externally visible URLs
– DNS remapping service is built into the CLC
• What happens if the master loses connectivity with the
user?
– Back-up may have an alternative path to user
– If DNS remaps, and the back-up becomes active, the system
may experience a “split brain”
• Fail stop
• Arbitrator service
• Multi-failure can cause split brain
– Master fails over, new master fails before the original is back,
original is then brought up => fail stop
- 16. IAM, Quotas, and Reporting
• IAM is AWS “Identity and Access Management”
– Accounts and users, and groups of users
– JSON-based policies define the calls that users and groups can
execute
– Also possible to attach policies to S3 resources (buckets for
now)
• Eucalyptus extends the IAM predicates with inequalities
– Implements quotas as tests against IAM policies
• Resource usage information exportable in a variety of
formats and through GUI
- 17. For Example
[Diagram labels: eucalyptus, dev, support, sales; callouts: quota, EC2 image permission, S3 bucket ACL]
{
  "Version":"2012-02-12",
  "Statement":[{
    "Sid":"2",
    "Effect":"Limit",
    "Action":"ec2:RunInstances",
    "Resource":"*",
    "Condition":{
      "NumericLessThanEquals":{
        "ec2:quota-vminstancenumber":"256"
      }
    }
  }]
}
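
To make the quota-as-inequality idea concrete, here is a minimal Python sketch that reads a policy shaped like the one on this slide and tests a request against it. The helper names `quota_limit` and `within_quota`, and the rule "current + requested must not exceed the limit", are assumptions for illustration; the slides do not show how Eucalyptus evaluates the condition internally.

```python
import json

# Policy text copied from the slide (hyphens normalized).
POLICY = json.loads("""
{
  "Version": "2012-02-12",
  "Statement": [{
    "Sid": "2",
    "Effect": "Limit",
    "Action": "ec2:RunInstances",
    "Resource": "*",
    "Condition": {
      "NumericLessThanEquals": {"ec2:quota-vminstancenumber": "256"}
    }
  }]
}
""")


def quota_limit(policy, action, key):
    """Return the tightest numeric limit for `key` among "Limit"
    statements matching `action`, or None if there is no such limit."""
    limits = []
    for stmt in policy["Statement"]:
        if stmt.get("Effect") != "Limit" or stmt.get("Action") != action:
            continue
        cond = stmt.get("Condition", {}).get("NumericLessThanEquals", {})
        if key in cond:
            limits.append(int(cond[key]))
    return min(limits) if limits else None


def within_quota(policy, action, key, current, requested):
    """Hypothetical check: usage after the request must stay <= limit."""
    limit = quota_limit(policy, action, key)
    return limit is None or current + requested <= limit


if __name__ == "__main__":
    key = "ec2:quota-vminstancenumber"
    print(within_quota(POLICY, "ec2:RunInstances", key, 250, 6))
    print(within_quota(POLICY, "ec2:RunInstances", key, 250, 7))
```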
- 18. Evaluation Logic
[Flowchart: checks applied in order, left to right]
• Account admin or sys admin? Yes -> Accept; No -> next check
• Account-level permission allowed? No -> Reject; Yes -> next check
• IAM user policy satisfied? No -> Reject; Yes -> next check
• Allocating resources? No -> Accept; Yes -> next check
• Exceeding quota? Yes -> Reject; No -> Accept
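
One reading of the evaluation flowchart on this slide, written as a short Python function. The `Request` fields and the `evaluate` name are hypothetical, and the ordering of checks is reconstructed from the flowchart labels, so treat this as a sketch rather than the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # All fields are hypothetical inputs that summarize the five questions
    # asked by the flowchart.
    is_admin: bool               # account admin or sys admin?
    account_allows: bool         # account-level permission allowed?
    user_policy_satisfied: bool  # IAM user policy satisfied?
    allocates_resources: bool    # does the call allocate resources?
    exceeds_quota: bool          # would it exceed a quota?


def evaluate(req: Request) -> str:
    """Walk the checks in flowchart order and return "Accept" or "Reject"."""
    if req.is_admin:
        return "Accept"
    if not req.account_allows:
        return "Reject"
    if not req.user_policy_satisfied:
        return "Reject"
    if not req.allocates_resources:
        return "Accept"
    return "Reject" if req.exceeds_quota else "Accept"


if __name__ == "__main__":
    print(evaluate(Request(False, True, True, True, False)))
    print(evaluate(Request(False, True, True, True, True)))
```

Note that under this reading quotas are only consulted for calls that allocate resources, and administrators bypass every check.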
- 19. Windows
• Windows images are big
– One customer wants 200 GB images
– Ephemeral within the C: drive
• Need a way to use CoW to improve Windows launch time
- 20. The Blob Store
• Blobs are (sparse) files on the file system
– remember to use ‘ls -s’ to see disk space allocated
– files are mounted on loopback when in use
– future implementation could use LVM volumes instead of files
• Mapping and copy-on-write snapshots are implemented
using the Linux kernel’s device-mapper (same as LVM
snapshots)
– once snapshotted or mapped, file access method cannot be
used
– i.e., backing file on disk no longer has the bits you want
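
The sparse-file point can be checked directly. The sketch below creates an empty file of a given apparent size and compares it with the space actually allocated, which is the same distinction ‘ls -s’ exposes. It is a generic illustration, not Blob Store code; the function name is made up, and it assumes a Unix-like system where `st_blocks` is available and counted in 512-byte units.

```python
import os
import tempfile


def sparse_file_sizes(apparent_bytes):
    """Create a temporary sparse file and return (apparent, allocated) bytes."""
    with tempfile.NamedTemporaryFile() as f:
        # Extending the file with truncate() writes no data, so a file system
        # that supports sparse files allocates few or no blocks for it.
        f.truncate(apparent_bytes)
        f.flush()
        st = os.stat(f.name)
        # st_blocks is conventionally in 512-byte units (assumption; Unix only).
        return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    apparent, allocated = sparse_file_sizes(64 * 1024 * 1024)
    print("apparent bytes: ", apparent)
    print("allocated bytes:", allocated)
```

On a file system with sparse-file support the allocated figure is typically far below the apparent size.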
- 21. Image -> Instance in the NC
[Diagram: “Eucalyptus Linux Image on NC”. EMI, ERI, and EKI are downloaded from Walrus into the NC cache area. EKI and ERI are copied into the NC work space. EMI is snapshotted (snap) as EMI + KEY and mapped (map); ephemeral0 (mkfs.ext3) and swap (mkswap) are each snapshotted and mapped; a PT box is labeled zero, snap, map. Resulting disk layout in the work space: PT, EMI + KEY, ephemeral0, swap.]
• NC’s cache keeps objects from Walrus and partitions created from scratch, one per size/type
• LRU eviction policy for non-pinned objects limits disk use
• EKI and ERI are copied to work space due to libvirt requirement
• Other objects are snapshotted, tuned, and then mapped to compose the disk
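
The cache policy described on this slide (LRU eviction that skips pinned objects) can be sketched as follows. The class name, byte-based capacity accounting, and the choice to raise an error when only pinned objects remain are assumptions for illustration; the slides do not describe the NC cache at this level of detail.

```python
from collections import OrderedDict


class PinnedLRUCache:
    """Size-bounded cache that evicts least-recently-used, non-pinned items."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self._items = OrderedDict()  # key -> (size, pinned), oldest first

    def keys(self):
        return list(self._items)

    def get(self, key):
        """Mark `key` as recently used; return (size, pinned) or None."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, size, pinned=False):
        """Insert an object, evicting old non-pinned objects if needed."""
        if key in self._items:
            self.used -= self._items.pop(key)[0]
        # Walk from least to most recently used, skipping pinned entries.
        for victim in list(self._items):
            if self.used + size <= self.capacity:
                break
            if not self._items[victim][1]:
                self.used -= self._items.pop(victim)[0]
        if self.used + size > self.capacity:
            raise MemoryError("not enough evictable space in cache")
        self._items[key] = (size, pinned)
        self.used += size


if __name__ == "__main__":
    cache = PinnedLRUCache(100)
    cache.put("emi-a", 40, pinned=True)   # in use by a running instance
    cache.put("emi-b", 40)
    cache.put("swap-512", 20)
    cache.get("emi-b")                    # emi-b is now most recently used
    cache.put("emi-c", 20)                # forces eviction of swap-512
    print(cache.keys())
```

Pinning here stands for "an instance is currently using this object", so eviction can only reclaim space from objects nobody has mapped.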
- 22. What’s Next?
• Eucalyptus 3.1 (Q2)
– Refactoring for packaged plug-ins
– Postgres instead of MySQL
• Eucalyptus 3.2 (Q4)
– Feature release
– Possibilities
• ELB, CloudWatch, Auto Scaling
• Tags
• Eucalyptus 4 in 2013 and Eucalyptus 5 in 2014
– Application features -> services and API
– Operational features -> ease of use, maintenance,
performance
• Please help! – tell us what Eucalyptus needs and when it
needs it
- 23. Thanks!
Questions?
• rich@eucalyptus.com
• @richwolski