This is the presentation from the OpenStack Hong Kong Conference from Fall 2013.
There are many different blueprints describing how high availability can be achieved underneath an OpenStack cloud. At PayPal, we have chosen to use some of the common OpenStack best practices and to introduce common data center best practices to bring high availability to the management/control infrastructure within our cloud. Topics included:
• Design of our OpenStack control infrastructure
• Pros and cons of management and infrastructure racks separate from a compute rack
• High availability requirements by component
• Pros and cons of high availability choices external to and within the cloud
• Trade-offs that need to be made now to ensure availability
http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/openstack-high-availability-paypal
2. ABOUT PAYPAL
PayPal offers flexible and innovative payment solutions for consumers
and merchants of all sizes.
• 137,000,000 users
• $300,000 in payments processed each minute
• 193 markets / 26 currencies
• The World's Most Widely Used Digital Wallet
3. AGENDA
Why HA is important for PayPal?
Our Learning
Our Solution
What is not solved?
Q&A
4. WHY HA IS IMPORTANT?
“no perceived downtime” for cloud users
Enterprise Class
Auto Scaling & Flex up/down can never break
API Integrations always succeed
Everyone expected to use the cloud
5. AVAILABILITY REQUIREMENTS
No SPOF “Under the Cloud”
Scale Across the Data Center(s)
Scale Across Racks & Containers
Respect natural availability zones within the data centers
No 'cloud' can impact any other 'cloud'
6. INFRASTRUCTURE RACK
[Diagram: Layer 2 versus Layer 3 rack connectivity. Infrastructure / controller racks and compute racks each have 10g active and 10g passive uplinks plus 1g management links, with an LB active / LB passive pair at the access layer. Label: Cattle & Puppies.]
7. INFRASTRUCTURE RACK
OpenStack services all run as VMs on KVM
Every infra component resides on 2+ nodes
Redundant physical racks
Redundant power/switches in each rack
Layer-3 connectivity between racks (no Layer 2)
Enterprise Grade Physical LB (floating VIP)
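
The "2+ nodes" and "redundant racks" rules above can be checked mechanically against an inventory. The following is a minimal sketch, not PayPal's tooling: the `placement` layout, the component and rack names, and the stricter reading that every component should also span at least two racks are all assumptions made for illustration.

```python
def find_spofs(placement):
    """Return components that break the redundancy rules.

    placement maps a component name to a list of (node, rack) tuples.
    A component is flagged if it runs on fewer than 2 distinct nodes,
    or if all of its nodes sit in one rack (assumed interpretation of
    "redundant physical racks").
    """
    problems = {}
    for component, instances in placement.items():
        nodes = {node for node, _ in instances}
        racks = {rack for _, rack in instances}
        reasons = []
        if len(nodes) < 2:
            reasons.append("fewer than 2 nodes")
        if len(racks) < 2:
            reasons.append("single rack")
        if reasons:
            problems[component] = reasons
    return problems


# Hypothetical inventory; names are illustrative only.
example_placement = {
    "keystone": [("ctl1", "rackA"), ("ctl2", "rackB")],
    "glance": [("ctl1", "rackA"), ("ctl3", "rackA")],
    "mysql": [("db1", "rackA")],
}

if __name__ == "__main__":
    for name, reasons in find_spofs(example_placement).items():
        print(name, "->", ", ".join(reasons))
```

A check like this could run whenever the inventory changes, so that a placement that quietly collapses onto one rack is caught before a rack failure exposes it.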
8. COMPUTE
[Diagram: three compute racks (1, 2, 3) behind LB active / LB passive pairs at the access layer. Each rack has 10g active and 10g passive uplinks and 1g management links. Compute node label: 96 Hyperscale, 16 Core, 256GB RAM, 1.1T Disk.]
10. OPENSTACK SERVICES
[Diagram: service and port map for the control plane. The extraction does not preserve which connection each port label belongs to, so components and ports are listed as they appear.]
• Entry points: Browser, httpd (dashboard), F5 Load Balancer, Remedy API, UDNS (DNSaaS), LBaaS, DNS Master
• OpenStack Controller (x3): keystone (keystone-admin, keystone-api), nova (nova-api, nova metadata-api, nova volume-api), glance (glance-admin, glance-reg), swift-proxy, quantum (Quantum Server, quantum-api)
• swift storage node (x3): swift-object, swift-container, swift-account
• Nicira NVP Controller (x3), NVP Service Node (x3), NVP Gateway (x3)
• Compute Node: Hypervisor, nova, OpenVswitch (ovs-vswitchd, ovsdb-server), puppet
• Other boxes: MYSQL DB (mysql 5), mq, Mongo DB, Puppet DB, Puppet VIP
• Port labels (TCP unless noted): 6000, 6001, 6002, 80, 443, 9696, 53, 10053, 22/80/443/161, 161 UDP, 9292, 9191, 6633 (openflow), 6632 (mgmt port), 35357, 5000, 8773, 8774, 8776, 8080, 8140, 61613, 3115, xxxx
11. OPENSTACK CONSIDERATIONS
LB VIP for every service (unless it can't)
Connect to LB VIP, not individual nodes
Script to close Server Connections
Pacemaker only works inside a single Layer-2 domain (not across a large enterprise network)
Auto Restart using Monit
MySQL
Swift Cluster
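
The "connect to the VIP" and "auto restart" points both depend on a liveness probe of some kind. As a rough sketch of the idea only (Monit and the load balancer have their own configuration formats, which are not shown here), a TCP connect check in Python might look like this. The host name in the usage note and the helper names are hypothetical; 5000 and 9292 are the usual default ports for the Keystone and Glance APIs.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def services_needing_restart(host, service_ports, probe=is_listening):
    """Return the sorted names of services whose port does not answer.

    service_ports maps a service name to its TCP port. The probe is
    injectable so the decision logic can be exercised without a network.
    """
    return sorted(
        name for name, port in service_ports.items() if not probe(host, port)
    )
```

For example, `services_needing_restart("controller-1.example.internal", {"keystone-api": 5000, "glance-api": 9292})` would list whichever of the two ports is not accepting connections on that (made-up) node. Clients, by contrast, would point at the VIP so that a single failed node is invisible to them.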
13. CINDER SERVICES WORKFLOW
[Diagram: a user request (create volume) passes in numbered steps 1 to 6 through Cinder API, the AMQP queue, Cinder Scheduler, and Cinder Volume to the storage backends (Storage Backend1, Storage Backend2).]
Figure shows a typical interaction between Cinder components to serve an end-user request (creating a new volume in this example).
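
The hand-offs in this workflow can be walked through with a toy model. This is a sketch for intuition, not Cinder code: in-process `queue.Queue` objects stand in for the AMQP broker, the "most free capacity" scheduling rule is an assumption, and the function and backend names are made up.

```python
import queue


def create_volume(size_gb, backends):
    """Simulate the API -> queue -> scheduler -> queue -> volume path.

    backends maps a backend name to its free capacity in GB and is
    updated in place. Returns (chosen_backend, trace).
    """
    scheduler_q = queue.Queue()  # stand-in for the broker, API to scheduler
    volume_q = queue.Queue()     # stand-in for the broker, scheduler to volume
    trace = []

    # The API accepts the user request and publishes it to the queue.
    trace.append("api: accepted create-volume request")
    scheduler_q.put({"size_gb": size_gb})

    # The scheduler consumes the message and picks a backend
    # (assumed rule: the backend with the most free space that fits).
    request = scheduler_q.get()
    candidates = {
        name: free for name, free in backends.items()
        if free >= request["size_gb"]
    }
    if not candidates:
        raise RuntimeError("no backend has enough free capacity")
    chosen = max(candidates, key=candidates.get)
    trace.append(f"scheduler: chose {chosen}")
    volume_q.put({"size_gb": request["size_gb"], "backend": chosen})

    # The volume service consumes the message and talks to the backend.
    job = volume_q.get()
    backends[job["backend"]] -= job["size_gb"]
    trace.append(f"volume: created {job['size_gb']} GB on {job['backend']}")
    return chosen, trace
```

Every hop except the first and last goes through the queue, which is why the availability of the message broker matters as much as that of the Cinder services themselves.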
14. CINDER SERVICES WITH HA
[Diagram: a user request (create volume) enters through the Load Balancer, which fronts Cinder API A and Cinder API B. Behind them sit an AMQP Cluster, Cinder Scheduler A and B, and Cinder Volume A and B, with Storage Backend1 and Storage Backend2 at the bottom. Steps are numbered 1 to 6.]
How HA is implemented for Cinder components:
• API (stateless) – Load Balancer (A/A or A/P);
• Scheduler (stateless) – Pacemaker, Queue itself (A/A or A/P);
• Volume – Pacemaker, Queue itself (A/A or A/P).
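
For the stateless API tier, active/active behind a load balancer amounts to rotating requests across healthy nodes and skipping any that are marked down. The class below is a minimal sketch of that behaviour, assuming a simple round-robin policy; it is not how the F5 is configured, and the node names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Rotate across nodes, skipping any that are marked unhealthy."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._cycle = itertools.cycle(self.nodes)

    def mark_down(self, node):
        self.healthy.discard(node)

    def mark_up(self, node):
        if node in self.nodes:
            self.healthy.add(node)

    def pick(self):
        """Return the next healthy node, or raise if none are left."""
        if not self.healthy:
            raise RuntimeError("no healthy nodes behind the VIP")
        # One full pass over the ring is enough to find a healthy node.
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes behind the VIP")
```

With nodes `cinder-api-a` and `cinder-api-b`, requests alternate between the two; after `mark_down("cinder-api-a")` every request lands on `cinder-api-b` until the node is marked up again. An active/passive setup is the same idea with only one node marked healthy at a time.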
So a little bit about PayPal before we start: let's quickly run through some key details on what PayPal is and what we do. We're a payments company. You can think of PayPal as a digital wallet: one convenient, secure spot to keep all your ways to pay. And PayPal is not just on the internet for you to send money to a friend or buy something on eBay. Along with numerous merchants that let you pay with PayPal online, we are also in-store, in places like Home Depot and GNC. With this brick-and-mortar presence, you can leave your wallet at home, punch in your phone number and PIN code, and still buy something. With payment innovations like that, we continue to grow, as these numbers show: 137m active users and 300,000 dollars' worth of payments per minute. This tells you that scale is important to us, and we scale on a global basis to meet the needs of our customers worldwide, especially here in Asia. We're talking about nearly 200 markets and 26 currencies. We literally are the world's most widely used digital wallet.
• Shift from enterprise design model to cloud-based design
• Elastically scale and self-heal infrastructure to accommodate unpredictable usage patterns of customers and internet commerce
• Separate rapidly iterating customer experiences from core services
• Reduce overall cost per transaction within the environment
• Infrastructure rack only for cloud management gear
• Compute racks scale until IP addresses run out
• Neutron network(s) …
• NVP Gateway limit …
• Two entry points for infrastructure: PayPal product developers, and cloud operators to manage the cloud
• Centrally orchestrated using Heat
• Local storage: HP 4X600 GB (Mirror
• Cisco 4948 & Arista 7050
• Nicira NVP
• F5 10.2.2 LB
• http://www.palominodb.com/blog/2012/12/10/benchmarking-ndb-vs-galera
• MariaDB
• Bottleneck on LB during image transfer
• Heat: active/standby support, no active/active cluster
• Cinder Volume service doesn't play well with load balancer and VIP
• Talk about Cinder HA issues
• VM create issues due to failed RabbitMQ message delivery
• Issues in upgrading without downtime for major version rollouts
• No auto cleanup for stale DB rows
• The API response is not consistent due to DB locks and DB connection threads