OpenStack Summit
May 12-16, 2014
Atlanta, Georgia
Enhancing High Availability
in the Context of OpenStack
Qiming Teng
tengqim@cn.ibm.com
IBM Research
2 © 2014 IBM Corporation
Agenda
 High Availability (HA) Overview
 Four Types of HA in OpenStack
 OpenStack HA
 VM/Application HA Options
‒ VM/App HA Orchestrated
‒ Open Questions
 HA as a Service?
High Availability Overview
 Why HA?
‒ Single system
• Hardware failures
• Hypervisor defects
• OS (host/guest) crashes
• Application bugs
‒ In cloud
• Shared, virtualized storage
• Shared, virtualized network
 Use cases
‒ Server consolidation in private cloud
‒ Selling point for public cloud
‒ Ease of management
• Planned/unplanned downtime
‒ (potentially) a user consumable service
How to Achieve HA?
 Three Technologies
‒ Redundancy
• Capacity Planning
• Cost
‒ Detection
• Watchdog
• Heartbeat messages
‒ Recovery
• Transparency
• Data consistency
• Interruption time
 Implications
‒ Automatic
‒ Autonomous
[State diagram: OPERATING, FAILURE, RECOVERING]
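The detection and recovery steps above can be sketched as a small heartbeat monitor that moves through the three states shown on the slide. This is a minimal illustration, not code from the talk: the class name, the `timeout` parameter, and the method names are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    OPERATING = "operating"
    FAILURE = "failure"
    RECOVERING = "recovering"


@dataclass
class HeartbeatMonitor:
    """Tracks one monitored node (hypothetical helper, for illustration only)."""
    timeout: float                 # seconds of silence tolerated before failure
    last_seen: float = 0.0         # time of the most recent heartbeat
    state: State = State.OPERATING

    def beat(self, now: float) -> None:
        # A heartbeat arrived; a recovering node counts as healthy again.
        self.last_seen = now
        if self.state is State.RECOVERING:
            self.state = State.OPERATING

    def check(self, now: float) -> State:
        # Detection: declare failure once the silence exceeds the timeout.
        if self.state is State.OPERATING and now - self.last_seen > self.timeout:
            self.state = State.FAILURE
        return self.state

    def start_recovery(self) -> None:
        # Recovery starts automatically, with no operator in the loop.
        if self.state is State.FAILURE:
            self.state = State.RECOVERING


monitor = HeartbeatMonitor(timeout=10.0)
monitor.beat(0.0)
early = monitor.check(5.0)     # still within the timeout
late = monitor.check(11.0)     # silence exceeded the timeout
monitor.start_recovery()
during = monitor.state
monitor.beat(12.0)             # node reports in again
after = monitor.state
```

Passing the clock in as `now` keeps the sketch deterministic; a real monitor would read a monotonic clock instead.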
Four Types of High Availability in an OpenStack Cloud
Host
OpenStack
VM
Application
• Physical nodes
• Physical network
• Physical storage
• Hypervisor
• Host OS
• …
• Compute Controller
• Network Controller
• Database
• Message Queue
• Storage
• …
• Service Resiliency
• Quality of Service
• Cost
• Transparency
• Data Integrity
• ...
• Virtual Machine
• Incl. Container
• Virtual Network
• Virtual Storage
• VM Mobility
• Ease of Management
• ...
OpenStack HA: Deployment Pattern
 Main Focus
‒ Avoid SPOF (Single Point of Failure) in OpenStack services
• Controller, Network, Compute, Swift, etc.
‒ Stateful versus Stateless services
 Implementation
‒ Primarily based on Pacemaker/Corosync Linux-HA stack, plus
a load-balancer
‒ Keepalived/haproxy
 A Deployment Pattern, not part of OpenStack core
components
‒ HA Guide documentation
‒ Chef cookbooks
‒ TripleO elements
 Only deployment, no runtime management service
An example setup (RDO)
OpenStack HA: Intrinsic Supports
 Nova
‒ Host Aggregates
‒ Availability Zones
‒ Service Groups
• Internal heartbeat messages, zookeeper/memcached/matchmaker
‒ …
 Message Queues
‒ QPID heartbeats (60-second interval)
‒ ZeroMQ w/ MatchMaker
 Cinder
‒ Storwize driver (heartbeat: 10 seconds)
‒ Contrib services
 Swift
 ...
OpenStack HA: Internal Heartbeats
[tengqm@node1 ~]$ nova service-list
+----+------------------+-------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----+------------------+-------+----------+---------+-------+----------------------------+-----------------+
| 1 | nova-conductor | node1 | internal | enabled | up | 2014-04-27T20:37:09.000000 | - |
| 3 | nova-cert | node1 | internal | enabled | up | 2014-04-27T20:37:05.000000 | - |
| 4 | nova-scheduler | node1 | internal | enabled | up | 2014-04-27T20:37:06.000000 | - |
| 5 | nova-consoleauth | node1 | internal | enabled | up | 2014-04-27T20:37:05.000000 | - |
| 6 | nova-compute | node1 | nova | enabled | up | 2014-04-27T20:37:06.000000 | - |
+----+------------------+-------+----------+---------+-------+----------------------------+-----------------+
[tengqm@node1 ~]$ neutron agent-list
Starting new HTTP connection (1): 9.186.106.171
Starting new HTTP connection (1): 9.186.106.171
+--------------------------------------+--------------------+-------+-------+----------------+
| id | agent_type | host | alive | admin_state_up |
+--------------------------------------+--------------------+-------+-------+----------------+
| 0f9b8470-577e-4439-84f1-36ce92eac77d | Metadata agent | node1 | :-) | True |
| 7ac10787-9a62-4a96-868f-bd90bb46d52b | L3 agent | node1 | :-) | True |
| c89d0bac-8a41-44ee-8df0-389a9c8db428 | Open vSwitch agent | node1 | :-) | True |
| e138db2d-bf3b-4ac2-89ab-50dbb8771a7b | DHCP agent | node1 | :-) | True |
+--------------------------------------+--------------------+-------+-------+----------------+
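The `Updated_at` column above is what makes these listings a heartbeat view: a service is reported as up while its last update is recent enough. A minimal sketch of that check follows. The 60-second threshold and the function name `is_up` are assumptions for illustration; in a real deployment the threshold comes from service configuration (Nova has a `service_down_time` option for this purpose).

```python
from datetime import datetime, timedelta

# Timestamp layout used in the `nova service-list` output above.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def is_up(updated_at: str, now: datetime, down_time: int = 60) -> bool:
    """Return True if the last heartbeat is at most `down_time` seconds old.

    `down_time` is an assumed threshold, not a value taken from the slides.
    """
    last = datetime.strptime(updated_at, TS_FORMAT)
    return now - last <= timedelta(seconds=down_time)


# Two rows copied from the listing above.
rows = {
    "nova-conductor": "2014-04-27T20:37:09.000000",
    "nova-compute": "2014-04-27T20:37:06.000000",
}
now = datetime(2014, 4, 27, 20, 37, 30)
states = {name: is_up(ts, now) for name, ts in rows.items()}
```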
VM/Application HA: Guest Clusters
[Diagram: two physical hosts (hardware, host OS / hypervisor), each running guest VMs (guest OS with applications); OpenStack services nova, neutron, cinder, glance, heat, …; SCs/SDs]
VM/Application HA Timeline (reboot VM-2)
21:22:56 rgmanager Shutting down
rgmanager Stopping service load-balancer
rgmanager [script] Executing /etc/init.d/lbsvc stop
21:23:00
21:23:03
21:23:05
21:23:06
21:23:07
rgmanager [ip] Removing IPv4 address 10.0.2.212/24 from eth1
rgmanager Service load-balancer is stopped
rgmanager Resource groups locked; not evaluating
[CPG] got procleave message from node 2 (1)
dlm_controld: stop distributed lock manager (2)
21:23:04
rgmanager Dbus Released
rgmanager Stopped 1 service
rgmanager Disconnecting from CMAN
rgmanager Pausing to allow services to start on other nodes
[CPG] got procleave message from node 2
rgmanager Member 2 shutting down
dlm_controld: stop distributed lock manager (3)
rgmanager Evaluating load-balancer, stopped, owner none
rgmanager event (0:2:0) Processed
rgmanager Starting stopped service load-balancer
rgmanager [ip] Link for eth1: Detected
rgmanager [ip] Adding IP addr 10.0.2.212/24 to eth1
rgmanager [ip] Pinging addr 10.0.2.212 from dev eth1
rgmanager Event: Port Closed
rgmanager Node 2 is not listening
rgmanager [ip] Sending gratuitous ARP: 10.0.2.212 fa:16:3e:b4:69:b8 brd ff:ff:ff:ff:ff:ff
rgmanager [script] Executing /etc/init.d/lbsvc start
21:23:08
rgmanager Service load-balancer started
rgmanager 1 events processed
[Timeline columns: VM-2 (rebooted), VM-1]
VM/Application HA: Guest Clusters
[Diagram: two physical hosts (hardware, host OS / hypervisor) running guest VMs (guest OS with applications) built from different images: Service X (image from Dept A, twice), Application Y (image from Dept B), Service Z (image from Dept C); OpenStack services nova, neutron, cinder, glance, heat, …; SCs/SDs]
LIMITATIONS
- Ease of management - Application Specific - Intrusive
VM/Application HA: Intrinsic Supports
 Redundancy
‒ Nova
• Server Groups
• Virtual Ensembles ?
• Virtual Clusters ?
‒ Heat
• InstanceGroup resource
• ResourceGroup resource
 Detection
‒ RPC notification, oslo.messaging
‒ Ceilometer
 Recovery
‒ Fencing support in nova, cinder, neutron [in progress]
‒ VM reboot, rebuild, evacuation …
‒ OS::Heat::HARestarter resource in Heat (being deprecated)
‒ …
Ceilometer Notification
VM/Application HA: Heat Orchestrated – yesterday
[Diagram: Nova Server, heat-cfntools, heat-api-cloudwatch, Alarm, heat-engine, HARestarter, heat-api-cfn, Template]
VM/Application HA: Heat Orchestrated – yesterday
[Diagram: Nova Server, crond, cfn-hup, cfn-push-stats, heat-api-cloudwatch, heat-api-cfn, BOTO, MQ, heat-engine, Alarm, create_watch_data, Watch_rule, HARestarter, restart() (delete; create)]
VM/Application HA: Heat Orchestrated – today
[Diagram: Nova Server, os-collect-config, cfn-push-stats, metadata, ceilometer, Alarm, heat-engine, heat-api-cfn, heat-api-cloudwatch, create_watch_data, HARestarter, restart() (delete; create), nova, BOTO, MQ, HTTP]
VM/Application HA: Heat Orchestrated – tomorrow?
[Diagram: Nova Server, os-collect-config, ???, metadata, ceilometer, Alarm, heat-engine, VMCluster, “Restart”, nova, notification, MQ, HTTP]
1. Application heartbeats
2. VM heartbeats
3. A native signal / alarm
4. A notion of VM groups for
• VM/Application redundancy
• HA policies
5. Native support for VM and application recovery:
• reboot
• rebuild
• migrate
• remote-restart
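One way to read the recovery list above is as an escalation chain: try the least disruptive action first and fall through to the next when it fails. The sketch below illustrates that reading only. The ordering as an escalation policy, the `recover` function, and the callable-based action interface are assumptions, and the stub actions stand in for real compute API calls.

```python
from typing import Callable, Iterable, Optional, Tuple

# An action is a (name, callable) pair; the callable returns True on success.
Action = Tuple[str, Callable[[str], bool]]


def recover(vm_id: str, actions: Iterable[Action]) -> Optional[str]:
    """Try each recovery action in order; return the name of the first success."""
    for name, run in actions:
        try:
            if run(vm_id):
                return name
        except Exception:
            # A failed attempt is not fatal; escalate to the next action.
            continue
    return None


# Stub actions (hypothetical) that simulate different outcomes.
def reboot(vm_id: str) -> bool:
    return False                          # guest did not come back


def rebuild(vm_id: str) -> bool:
    raise RuntimeError("image unavailable")


def migrate(vm_id: str) -> bool:
    return True


def remote_restart(vm_id: str) -> bool:
    return True


chain = [
    ("reboot", reboot),
    ("rebuild", rebuild),
    ("migrate", migrate),
    ("remote-restart", remote_restart),
]
outcome = recover("vm-2", chain)
```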
VM/Application HA: Open Questions
 Physical placement of VMs
‒ No shared PDU/rack, no shared network switch
‒ HA-aware scheduling, e.g. server priority
 Detection of failures
‒ High availability and QoS, e.g. desired latency/throughput versus reality
‒ Reliable detection, application involvement, …
 Reasoning about failures
‒ Root cause, trend analysis
‒ Log collection and analysis
 HA management / orchestration
‒ As a cross-cutting concern, involving not only compute, but also storage and network
• Stack availability?
‒ Capacity planning / reservation
 Leverage existing HA software
‒ Can we leverage support from hypervisors?
‒ Can/should we generalize this into a service?
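The placement question above (no shared PDU, rack, or network switch) can be checked mechanically once each VM's failure domains are known. The following sketch assumes a hypothetical placement map from VM name to failure-domain identifiers; how that map would be obtained from the scheduler is left open, as it is in the slides.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def shared_failure_domains(
    placement: Dict[str, Dict[str, str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Return every (kind, id) failure domain that hosts more than one VM."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for vm, domains in placement.items():
        for kind, ident in domains.items():
            groups[(kind, ident)].append(vm)
    return {key: sorted(vms) for key, vms in groups.items() if len(vms) > 1}


# Hypothetical placement of two redundant VMs.
placement = {
    "vm-1": {"rack": "r1", "switch": "s1", "pdu": "p1"},
    "vm-2": {"rack": "r2", "switch": "s1", "pdu": "p2"},
}
violations = shared_failure_domains(placement)
```

Here the two VMs sit in different racks and on different PDUs but behind the same switch, so the switch is reported as a shared point of failure.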
High Availability as a Service (HAaaS)
 Generic HA management service
‒ Applicable to different levels of HA
• Host, VM, App, OpenStack
‒ Applicable to different hypervisors
• vSphere, KVM, Xen, Hyper-V, PowerVM, …
‒ Functionality determined via user authentication
 Well-defined service APIs
‒ Cluster management
‒ Application/Service resource definition
‒ HA policies
• Fail-over domain
• Fail-over priority, operation, timeout, retries, …
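The HA policy attributes listed above could be captured in a small record type. The sketch below is one possible shape, not a defined API: the field names, the defaults, and the `next_node` helper are assumptions, with the fail-over domain modeled as a node list in priority order.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class FailoverPolicy:
    """Illustrative HA policy; every field name and default is an assumption."""
    domain: List[str]            # fail-over domain: candidate nodes by priority
    operation: str = "restart"   # what to do with the service on failure
    timeout: int = 60            # seconds allowed for the operation
    retries: int = 3             # attempts before giving up

    def next_node(self, failed: str, healthy: Set[str]) -> Optional[str]:
        # Pick the highest-priority healthy node other than the failed one.
        for node in self.domain:
            if node != failed and node in healthy:
                return node
        return None


policy = FailoverPolicy(domain=["node1", "node2", "node3"])
first_choice = policy.next_node("node1", {"node2", "node3"})
fallback = policy.next_node("node1", {"node3"})
nowhere = policy.next_node("node1", set())
```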
HAaaS: OpenStack HA
VM or Physical Servers
HAaaS: VM HA
Physical Server
HAaaS: Application HA
VMs
Service Interfaces (1)
 Cluster Management
‒ cluster_create
‒ cluster_destroy
‒ cluster_start
‒ cluster_stop
‒ cluster_suspend
‒ cluster_resume
‒ cluster_set_attr
‒ cluster_get_attr
‒ cluster_by_host
‒ cluster_get_status
‒ cluster_get_log
‒ ...
 Node Management (physical/virtual)
‒ node_join_cluster
‒ node_leave_cluster
‒ node_get_attr
‒ node_set_attr
‒ node_startup
‒ node_shutdown
‒ node_reboot
‒ node_evacuate
‒ node_get_status
‒ ...
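To make the proposed interface more concrete, here is a toy in-memory sketch of a few of the cluster and node operations named above. Only the operation names come from the slide; the parameters, return values, and the `InMemoryClusterService` class are assumptions, and a real service would sit behind a REST API and drive actual cluster software.

```python
from typing import Dict, List, Set


class InMemoryClusterService:
    """Toy stand-in for the proposed HA service (hypothetical signatures)."""

    def __init__(self) -> None:
        self._clusters: Dict[str, Set[str]] = {}

    def cluster_create(self, name: str) -> None:
        if name in self._clusters:
            raise ValueError(f"cluster {name!r} already exists")
        self._clusters[name] = set()

    def cluster_destroy(self, name: str) -> None:
        del self._clusters[name]

    def node_join_cluster(self, node: str, cluster: str) -> None:
        self._clusters[cluster].add(node)

    def node_leave_cluster(self, node: str, cluster: str) -> None:
        self._clusters[cluster].discard(node)

    def cluster_by_host(self, node: str) -> List[str]:
        # Names of all clusters that the given host belongs to.
        return sorted(c for c, nodes in self._clusters.items() if node in nodes)

    def cluster_get_status(self, name: str) -> Dict[str, object]:
        nodes = sorted(self._clusters[name])
        return {"name": name, "nodes": nodes, "size": len(nodes)}


svc = InMemoryClusterService()
svc.cluster_create("web")
svc.node_join_cluster("node1", "web")
svc.node_join_cluster("node2", "web")
svc.node_leave_cluster("node1", "web")
status = svc.cluster_get_status("web")
membership = svc.cluster_by_host("node2")
```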
Service Interfaces (2)
 Resource Management
‒ resource_create
‒ resource_destroy
‒ resource_get_attr
‒ resource_set_attr
‒ ...
 Fencing Management
‒ Fencing_dev_add
‒ Fencing_dev_del
‒ Fencing_dev_associate
‒ Fencing_dev_deassociate
‒ Fencing_dev_set_opts
‒ Fencing_dev_get_opts
‒ ...
 Service Management (a.k.a. resource groups)
‒ service_create
‒ service_destroy
‒ service_add_resource
‒ service_del_resource
‒ service_list
‒ service_get_attr
‒ service_set_attr
‒ service_start
‒ service_stop
‒ service_restart
‒ service_relocate
‒ ...
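A service in this sense groups resources that must move together, as the load-balancer service in the rgmanager timeline does with its IP address and init script. The sketch below models that behavior under stated assumptions: resources start in the order they were added and stop in reverse, and relocation is a stop followed by a start on the target node. The `Service` class and its method names are illustrative and only loosely mirror the `service_*` operations above.

```python
from typing import List, Optional


class Service:
    """A resource group that starts, stops, and relocates as one unit."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.resources: List[str] = []
        self.owner: Optional[str] = None   # node currently running the service
        self.log: List[str] = []           # record of actions, for inspection

    def add_resource(self, resource: str) -> None:
        self.resources.append(resource)

    def start(self, node: str) -> None:
        # Bring resources up in the order they were added (IP before script).
        for res in self.resources:
            self.log.append(f"start {res} on {node}")
        self.owner = node

    def stop(self) -> None:
        if self.owner is None:
            return
        # Tear down in reverse order (script before IP).
        for res in reversed(self.resources):
            self.log.append(f"stop {res} on {self.owner}")
        self.owner = None

    def relocate(self, node: str) -> None:
        self.stop()
        self.start(node)


lb = Service("load-balancer")
lb.add_resource("ip 10.0.2.212/24")
lb.add_resource("script lbsvc")
lb.start("node2")
lb.relocate("node1")
```

After the relocation, `lb.log` shows the script stopping before the IP address is removed on node2, then the address being added before the script starts on node1, which is the same ordering visible in the timeline.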
IBM Sponsored Sessions
Monday, May 12 – Room B314
12:05-12:45
  OpenStack is Rockin’ the OpenCloud Movement! Who’s Next to Join the Band?
  Angel Diaz, VP Open Technology and Cloud Labs
  David Lindquist, IBM Fellow, VP, CTO Cloud & Smarter Infrastructure
Wednesday, May 14 – Room B312
9:00-9:40
  Getting from enterprise ready to enterprise bliss - why OpenStack and IBM is a match made in Cloud heaven.
  Todd Moore - Director, Open Technologies and Partnerships
9:50-10:30
  Taking OpenStack beyond Infrastructure with IBM SmartCloud Orchestrator.
  Andrew Trossman - Distinguished Engineer, IBM Common Cloud Stack and SmartCloud Orchestrator
11:00-11:40
  IBM, SoftLayer and OpenStack - present and future
  Michael Fork - Cloud Architect
11:50-12:30
  IBM and OpenStack: Enabling Enterprise Cloud Solutions Now.
  Tammy Van Hove - Distinguished Engineer, Software Defined Systems
IBM Technical Sessions
Monday, May 12
3:40 - 4:20
3:40 - 4:20
Tuesday, May 13
11:15 - 11:55
2:00 - 2:40
5:30 - 6:10
5:30 - 6:10
Wednesday, May14
9:50 - 10:30
2:40 - 3:20
Thursday, May 15
9:50 - 10:30
1:30 - 2:10
2:20 - 3:00
Be sure to stop by the IBM booth to see some demos
and get your rockin’ OpenStack T-shirt while they last.
Thank you!

The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...kalichargn70th171
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 

High Availability in OpenStack Cloud

  • 1. Enhancing High Availability in the Context of OpenStack
       OpenStack Summit, May 12-16, 2014, Atlanta, Georgia
       Qiming Teng (tengqim@cn.ibm.com), IBM Research
  • 2. Agenda
       © 2014 IBM Corporation
       ■ High Availability (HA) Overview
       ■ Four Types of HA in OpenStack
       ■ OpenStack HA
       ■ VM/Application HA Options
           ‒ VM/App HA Orchestrated
           ‒ Open Questions
       ■ HA as a Service?
  • 3. High Availability Overview
       ■ Why HA?
           ‒ Single system
               • Hardware failures
               • Hypervisor defects
               • OS (host/guest) crashes
               • Application bugs
           ‒ In cloud
               • Shared, virtualized storage
               • Shared, virtualized network
       ■ Use cases
           ‒ Server consolidation in private cloud
           ‒ Selling point for public cloud
           ‒ Ease of management
               • Planned/unplanned downtime
           ‒ (potentially) a user consumable service
  • 4. How to Achieve HA?
       ■ Three Technologies
           ‒ Redundancy
               • Capacity Planning
               • Cost
           ‒ Detection
               • Watchdog
               • Heartbeat messages
           ‒ Recovery
               • Transparency
               • Data consistency
               • Interruption time
       ■ Implications
           ‒ Automatic
           ‒ Autonomous
       [Diagram: states OPERATING, FAILURE, RECOVERING]
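The detection bullet on slide 4 (heartbeat messages) can be illustrated with a minimal sketch. This is not code from the talk: the class name `HeartbeatMonitor`, its methods, and the timeout value are all hypothetical, chosen only to show how silence longer than a timeout can be read as a failure.

```python
import time
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical helper: remembers when each node last sent a heartbeat."""
    timeout: float  # seconds of silence after which a node counts as failed
    last_seen: dict = field(default_factory=dict)

    def beat(self, node, now=None):
        # Record a heartbeat; `now` can be injected to make tests deterministic.
        self.last_seen[node] = time.monotonic() if now is None else now

    def failed_nodes(self, now=None):
        # A node is reported as failed once it has been silent longer than `timeout`.
        now = time.monotonic() if now is None else now
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=10)
    monitor.beat("node1", now=0)
    monitor.beat("node2", now=8)
    print(monitor.failed_nodes(now=12))
```

A real detector would also have to cope with network partitions and clock behavior, which is why the slide pairs detection with fencing and recovery concerns.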
  • 5. Four Types of High Availability in an OpenStack Cloud
       ■ Host
           • Physical nodes
           • Physical network
           • Physical storage
           • Hypervisor
           • Host OS
           • …
       ■ OpenStack
           • Compute Controller
           • Network Controller
           • Database
           • Message Queue
           • Storage
           • …
       ■ VM
           • Virtual Machine
           • Incl. Container
           • Virtual Network
           • Virtual Storage
           • VM Mobility
           • Ease of Management
           • ...
       ■ Application
           • Service Resiliency
           • Quality of Service
           • Cost
           • Transparency
           • Data Integrity
           • ...
  • 6. OpenStack HA: Deployment Pattern
       ■ Main Focus
           ‒ Avoid SPOF (Single Point of Failure) in OpenStack services
               • Controller, Network, Compute, Swift, etc.
           ‒ Stateful versus Stateless services
       ■ Implementation
           ‒ Primarily based on Pacemaker/Corosync Linux-HA stack, plus a load-balancer
           ‒ Keepalived/haproxy
       ■ A Deployment Pattern, not part of OpenStack core components
           ‒ HA Guide documentation
           ‒ Chef cookbooks
           ‒ TripleO elements
       ■ Only deployment, no runtime management service
  • 7. An example setup (RDO)
  • 8. OpenStack HA: Intrinsic Supports
       ■ Nova
           ‒ Host Aggregates
           ‒ Availability Zones
           ‒ Service Groups
               • Internal heartbeat messages, zookeeper/memcached/matchmaker
           ‒ …
       ■ Message Queues
           ‒ QPID heartbeats (60 seconds interval)
           ‒ ZeroMQ w/ MatchMaker
       ■ Cinder
           ‒ Storwize driver (heartbeat: 10 seconds)
           ‒ Contrib services
       ■ Swift
       ■ ...
  • 9. OpenStack HA: Internal Heartbeats

       [tengqm@node1 ~]$ nova service-list
       +----+------------------+-------+----------+---------+-------+----------------------------+-----------------+
       | Id | Binary           | Host  | Zone     | Status  | State | Updated_at                 | Disabled Reason |
       +----+------------------+-------+----------+---------+-------+----------------------------+-----------------+
       | 1  | nova-conductor   | node1 | internal | enabled | up    | 2014-04-27T20:37:09.000000 | -               |
       | 3  | nova-cert        | node1 | internal | enabled | up    | 2014-04-27T20:37:05.000000 | -               |
       | 4  | nova-scheduler   | node1 | internal | enabled | up    | 2014-04-27T20:37:06.000000 | -               |
       | 5  | nova-consoleauth | node1 | internal | enabled | up    | 2014-04-27T20:37:05.000000 | -               |
       | 6  | nova-compute     | node1 | nova     | enabled | up    | 2014-04-27T20:37:06.000000 | -               |
       +----+------------------+-------+----------+---------+-------+----------------------------+-----------------+

       [tengqm@node1 ~]$ neutron agent-list
       Starting new HTTP connection (1): 9.186.106.171
       Starting new HTTP connection (1): 9.186.106.171
       +--------------------------------------+--------------------+-------+-------+----------------+
       | id                                   | agent_type         | host  | alive | admin_state_up |
       +--------------------------------------+--------------------+-------+-------+----------------+
       | 0f9b8470-577e-4439-84f1-36ce92eac77d | Metadata agent     | node1 | :-)   | True           |
       | 7ac10787-9a62-4a96-868f-bd90bb46d52b | L3 agent           | node1 | :-)   | True           |
       | c89d0bac-8a41-44ee-8df0-389a9c8db428 | Open vSwitch agent | node1 | :-)   | True           |
       | e138db2d-bf3b-4ac2-89ab-50dbb8771a7b | DHCP agent         | node1 | :-)   | True           |
       +--------------------------------------+--------------------+-------+-------+----------------+
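As a rough illustration of how the heartbeat state shown on slide 9 could be consumed by a script, the sketch below parses a table in the `nova service-list` layout and lists the binaries whose State is not `up`. The function names are hypothetical, and the `down` row in the sample is invented for the example; the slide itself shows every service as `up`.

```python
def parse_cli_table(text):
    """Parse an ASCII table (| a | b | rows) into a list of dicts keyed by header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ borders and any log lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first | row is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def services_not_up(rows):
    """Return the Binary names of rows whose State column is anything but 'up'."""
    return [row["Binary"] for row in rows if row.get("State") != "up"]


# Sample in the slide's layout; the 'down' row is a made-up variation.
SAMPLE = """
+----+----------------+-------+----------+---------+-------+
| Id | Binary         | Host  | Zone     | Status  | State |
+----+----------------+-------+----------+---------+-------+
| 1  | nova-conductor | node1 | internal | enabled | up    |
| 6  | nova-compute   | node1 | nova     | enabled | down  |
+----+----------------+-------+----------+---------+-------+
"""

if __name__ == "__main__":
    print(services_not_up(parse_cli_table(SAMPLE)))
```

Scraping CLI output is fragile; it is used here only because the slide presents the data in that form.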
  • 10. VM/Application HA: Guest Clusters
       [Diagram: four guest VMs (Guest OS, each running two Apps) on two hosts (hardware, Host OS / Hypervisor); OpenStack services: nova, neutron, cinder, glance, …, heat; SCs/SDs]
  • 11. VM/Application HA Timeline (reboot VM-2)
       Columns: VM-2 (rebooted), VM-1
       Timeline marks: 21:22:56, 21:23:00, 21:23:03, 21:23:04, 21:23:05, 21:23:06, 21:23:07, 21:23:08
       Logged events, in transcript order:
           rgmanager Shutting down
           rgmanager Stopping service load-balancer
           rgmanager [script] Executing /etc/init.d/lbsvc stop
           rgmanager [ip] Removing IPv4 address 10.0.2.212/24 from eth1
           rgmanager Service load-balancer is stopped
           rgmanager Resource groups locked; not evaluating
           [CPG] got procleave message from node 2 (1)
           dlm_controld: stop distributed lock manager (2)
           rgmanager Dbus Released
           rgmanager Stopped 1 service
           rgmanager Disconnecting from CMAN
           rgmanager Pausing to allow services to start on other nodes
           [CPG] got procleave message from node 2
           rgmanager Member 2 shutting down
           dlm_controld: stop distributed lock manager (3)
           rgmanager Evaluating load-balancer, stopped, owner none
           rgmanager event (0:2:0) Processed
           rgmanager Starting stopped service load-balancer
           rgmanager [ip] Link for eth1: Detected
           rgmanager [ip] Adding IP addr 10.0.2.212/24 to eth1
           rgmanager [ip] Pinging addr 10.0.2.212 from dev eth1
           rgmanager Event: Port Closed
           rgmanager Node 2 is not listening
           rgmanager [ip] Sending gratuitous ARP: 10.0.2.212 fa:16:3e:b4:69:b8 brd ff:ff:ff:ff:ff:ff
           rgmanager [script] Executing /etc/init.d/lbsvc start
           rgmanager Service load-balancer started
           rgmanager 1 events processed
  • 12. VM/Application HA: Guest Clusters
       [Diagram: four guest VMs (Guest OS, each running two Apps) on two hosts (hardware, Host OS / Hypervisor); OpenStack services: nova, neutron, cinder, glance, …, heat; SCs/SDs. Guest images: Service X (image from Dept A), Application Y (image from Dept B), Service X (image from Dept A), Service Z (image from Dept C)]
       LIMITATIONS
           - Ease of management
           - Application Specific
           - Intrusive
  • 13. VM/Application HA: Intrinsic Supports
       ■ Redundancy
           ‒ Nova
               • Server Groups
               • Virtual Ensembles ?
               • Virtual Clusters ?
           ‒ Heat
               • InstanceGroup resource
               • ResourceGroup resource
       ■ Detection
           ‒ RPC notification, oslo.messaging
           ‒ Ceilometer
       ■ Recovery
           ‒ Fencing support in nova, cinder, neutron [undergoing]
           ‒ VM reboot, rebuild, evacuation …
           ‒ OS::Heat::HARestarter resource in Heat (deprecating)
           ‒ …
       [Diagram labels: Ceilometer, Notification]
  • 14. VM/Application HA: Heat Orchestrated – yesterday
       [Diagram components: Nova Server, heat-cfntools, heat-api-cloudwatch, Alarm, heat-engine, HARestarter, heat-api-cfn, Template]
  • 15. VM/Application HA: Heat Orchestrated – yesterday
       [Diagram components: Nova Server (crond, cfn-hup, cfn-push-stats), heat-api-cloudwatch, Alarm, heat-engine (create_watch_data, Watch_rule), HARestarter (restart(): delete; create), heat-api-cfn; links labeled BOTO, BOTO, MQ]
  • 16. VM/Application HA: Heat Orchestrated – today
       [Diagram components: Nova Server (os-collect-config, cfn-push-stats), ceilometer, Alarm, heat-engine, Alarm, nova, heat-api-cfn, HARestarter (restart(): delete; create), heat-api-cloudwatch (create_watch_data); links labeled BOTO, MQ, metadata, MQ, MQ, metadata, HTTP]
  • 17. VM/Application HA: Heat Orchestrated – tomorrow?
       [Diagram components: Nova Server (os-collect-config, ???), ceilometer, Alarm, heat-engine, Alarm, nova, VMCluster, “Restart”; links labeled MQ, MQ, metadata, metadata, HTTP, notification, MQ]
       1 Application Heartbeats
       2 VM Heartbeats
       3 A native signal / alarm
       4 A notion of VM groups for
           • VM/Application redundancy
           • HA policies
       5 Native support for VM and application recovery:
           • reboot
           • rebuild
           • migrate
           • remote-restart
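Item 5 on slide 17 names four recovery actions. One way such actions might be combined is an escalation loop that tries them in turn until one succeeds. The sketch below assumes that ordering and uses a hypothetical `recover` function with caller-supplied handlers; neither the ordering nor the interface comes from the talk or from any OpenStack API.

```python
# Action names taken from the slide; the escalation order is an assumption.
RECOVERY_ACTIONS = ("reboot", "rebuild", "migrate", "remote-restart")


def recover(vm, handlers, actions=RECOVERY_ACTIONS):
    """Try each recovery action in order; return the first one that succeeds.

    `handlers` maps an action name to a callable taking the VM identifier and
    returning True on success. Actions without a handler are skipped.
    Returns None if every available action fails.
    """
    for action in actions:
        handler = handlers.get(action)
        if handler is None:
            continue
        try:
            if handler(vm):
                return action
        except Exception:
            # In this sketch a handler that raises is treated as a failed attempt.
            continue
    return None


if __name__ == "__main__":
    handlers = {
        "reboot": lambda vm: False,   # pretend the reboot did not help
        "rebuild": lambda vm: True,   # pretend the rebuild worked
    }
    print(recover("vm-2", handlers))
```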
  • 18. VM/Application HA: Open Questions
       ■ Physical placement of VMs
           ‒ No shared PDU/rack, no shared network switch
           ‒ HA-aware scheduling, e.g. server priority
       ■ Detection of failures
           ‒ High availability and QoS, e.g. desired latency/throughput versus reality
           ‒ Reliable detection, application involvement, …
       ■ Reasoning of failures
           ‒ Root cause, trend analysis
           ‒ Log collection and analysis
       ■ HA management / orchestration
           ‒ As a cross-cutting concern, involving not only compute, but also storage and network
               • Stack availability?
           ‒ Capacity planning / reservation
       ■ Leverage existing HA software
           ‒ Can we leverage supports from hypervisors?
           ‒ Can/should we generalize this into a service?
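The placement question on slide 18 (no shared PDU/rack, no shared network switch) amounts to checking that redundant VMs do not share a failure domain. A minimal sketch of such a check follows; the function name, the placement dictionary layout, and the domain keys `rack`, `pdu`, and `switch` are assumptions made for illustration.

```python
from itertools import combinations


def shared_failure_domains(placements, domains=("rack", "pdu", "switch")):
    """Return (vm_a, vm_b, domain) triples for VM pairs that share a failure domain.

    `placements` maps a VM name to a dict such as
    {"rack": "r1", "pdu": "p1", "switch": "s1"}. Missing keys are ignored.
    """
    conflicts = []
    for (vm_a, place_a), (vm_b, place_b) in combinations(sorted(placements.items()), 2):
        for domain in domains:
            value = place_a.get(domain)
            if value is not None and value == place_b.get(domain):
                conflicts.append((vm_a, vm_b, domain))
    return conflicts


if __name__ == "__main__":
    placements = {
        "vm-1": {"rack": "r1", "pdu": "p1", "switch": "s1"},
        "vm-2": {"rack": "r2", "pdu": "p1", "switch": "s2"},
    }
    print(shared_failure_domains(placements))
```

An HA-aware scheduler would apply a constraint like this at placement time rather than auditing afterwards; the audit form is simply easier to show in a few lines.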
  • 19. High Availability as a Service (HAaaS)
       ■ Generic HA management service
           ‒ Applicable to different levels of HA
               • Host, VM, App, OpenStack
           ‒ Applicable to different hypervisors
               • vSphere, KVM, Xen, HyperV, PowerVM, …
           ‒ Functionality determined via user authentication
       ■ Well-defined service APIs
           ‒ Clusters management
           ‒ Application/Service resource definition
           ‒ HA policies
               • Fail-over domain
               • Fail-over priority, operation, timeout, retries, …
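The HA policy attributes listed on slide 19 (fail-over domain, priority, operation, timeout, retries) could be modeled as a small record. The sketch below is one hypothetical shape for it: the class `HAPolicy`, its field names and defaults, and the rule that priority is the order of nodes in the fail-over domain are all assumptions, not a definition from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HAPolicy:
    """Hypothetical HA policy record built from the attributes named on the slide."""
    failover_domain: tuple       # candidate nodes, highest priority first
    operation: str = "restart"   # what to do with the service on fail-over
    timeout_s: int = 60          # how long one attempt may take
    retries: int = 3             # how many attempts before giving up


def pick_target(policy, healthy_nodes, failed_node):
    """Return the highest-priority healthy node in the fail-over domain, or None."""
    for node in policy.failover_domain:
        if node != failed_node and node in healthy_nodes:
            return node
    return None


if __name__ == "__main__":
    policy = HAPolicy(failover_domain=("node1", "node2", "node3"))
    print(pick_target(policy, healthy_nodes={"node2", "node3"}, failed_node="node1"))
```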
  • 20. HAaaS: OpenStack HA
       [Diagram labels: VM or Physical Servers; two relations labeled "have"]
  • 21. HAaaS: VM HA
       [Diagram labels: Physical Server; two relations labeled "have"]
  • 22. HAaaS: Application HA
       [Diagram labels: VMs; two relations labeled "have"]
  • 23. Service Interfaces (1)
       ■ Cluster Management
           ‒ cluster_create
           ‒ cluster_destroy
           ‒ cluster_start
           ‒ cluster_stop
           ‒ cluster_suspend
           ‒ cluster_resume
           ‒ cluster_set_attr
           ‒ cluster_get_attr
           ‒ cluster_by_host
           ‒ cluster_get_status
           ‒ cluster_get_log
           ‒ ...
       ■ Node Management (physical/virtual)
           ‒ node_join_cluster
           ‒ node_leave_cluster
           ‒ node_get_attr
           ‒ node_set_attr
           ‒ node_startup
           ‒ node_shutdown
           ‒ node_reboot
           ‒ node_evacuate
           ‒ node_get_status
           ‒ ...
  • 24. Service Interfaces (2)
       ■ Resource Management
           ‒ resource_create
           ‒ resource_destroy
           ‒ resource_get_attr
           ‒ resource_set_attr
           ‒ ...
       ■ Fencing Management
           ‒ Fencing_dev_add
           ‒ Fencing_dev_del
           ‒ Fencing_dev_associate
           ‒ Fencing_dev_deassociate
           ‒ Fencing_dev_set_opts
           ‒ Fencing_dev_get_opts
           ‒ ...
       ■ Service Management (aka. resource groups)
           ‒ service_create
           ‒ service_destroy
           ‒ service_add_resource
           ‒ service_del_resource
           ‒ service_list
           ‒ service_get_attr
           ‒ service_set_attr
           ‒ service_start
           ‒ service_stop
           ‒ service_restart
           ‒ service_relocate
           ‒ ...
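To make the interface lists on slides 23 and 24 more concrete, here is a toy in-memory stand-in for a handful of the named operations. Only the method names come from the slides; the class `InMemoryHAService`, every parameter, and all behavior are assumptions, since the talk does not give signatures or semantics.

```python
class InMemoryHAService:
    """Toy stand-in for a few of the proposed HAaaS calls (signatures assumed)."""

    def __init__(self):
        self.clusters = {}   # cluster name -> set of member node names
        self.services = {}   # service name -> node currently owning it (or None)

    def cluster_create(self, name):
        # Creating an existing cluster is a no-op in this sketch.
        self.clusters.setdefault(name, set())

    def node_join_cluster(self, node, cluster):
        self.clusters[cluster].add(node)

    def node_leave_cluster(self, node, cluster):
        self.clusters[cluster].discard(node)

    def service_create(self, name, owner=None):
        self.services[name] = owner

    def service_list(self):
        return sorted(self.services)

    def service_relocate(self, name, target):
        # Move a service (resource group) to another node and report the new owner.
        if name not in self.services:
            raise KeyError(name)
        self.services[name] = target
        return target


if __name__ == "__main__":
    ha = InMemoryHAService()
    ha.cluster_create("demo")
    ha.node_join_cluster("vm-1", "demo")
    ha.node_join_cluster("vm-2", "demo")
    ha.service_create("load-balancer", owner="vm-2")
    ha.node_leave_cluster("vm-2", "demo")
    print(ha.service_relocate("load-balancer", "vm-1"))
```

The usage at the bottom mirrors the reboot scenario from the timeline slide, where the load-balancer service moves from VM-2 to VM-1.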
  • 25. IBM Sponsored Sessions
       ■ Monday, May 12 – Room B314
           ‒ 12:05-12:45: OpenStack is Rockin’ the OpenCloud Movement! Who‘s Next to Join the Band?
               • Angel Diaz, VP Open Technology and Cloud Labs
               • David Lindquist, IBM Fellow, VP, CTO Cloud & Smarter Infrastructure
       ■ Wednesday, May 14 – Room B312
           ‒ 9:00-9:40: Getting from enterprise ready to enterprise bliss - why OpenStack and IBM is a match made in Cloud heaven.
               • Todd Moore, Director, Open Technologies and Partnerships
           ‒ 9:50-10:30: Taking OpenStack beyond Infrastructure with IBM SmartCloud Orchestrator.
               • Andrew Trossman, Distinguished Engineer, IBM Common Cloud Stack and SmartCloud Orchestrator
           ‒ 11:00-11:40: IBM, SoftLayer and OpenStack - present and future
               • Michael Fork, Cloud Architect
           ‒ 11:50-12:30: IBM and OpenStack: Enabling Enterprise Cloud Solutions Now.
               • Tammy Van Hove, Distinguished Engineer, Software Defined Systems
  • 26. IBM Technical Sessions
       ■ Monday, May 12
           ‒ 3:40 - 4:20
           ‒ 3:40 - 4:20
       ■ Tuesday, May 13
           ‒ 11:15 - 11:55
           ‒ 2:00 - 2:40
           ‒ 5:30 - 6:10
           ‒ 5:30 - 6:10
       ■ Wednesday, May 14
           ‒ 9:50 - 10:30
           ‒ 2:40 - 3:20
       ■ Thursday, May 15
           ‒ 9:50 - 10:30
           ‒ 1:30 - 2:10
           ‒ 2:20 - 3:00
  • 27. Be sure to stop by the IBM booth to see some demos and get your rockin’ OpenStack T-shirt while they last. Thank you!