2. Introduction
CLOUDSTACK – JOURNEY TO A NEXT GEN CLOUD
ABOUT ME!
Director, Platform Infrastructure
Remit: infrastructure that runs the majority of the Ticketmaster platform in International
Ticketmaster was born in Arizona in 1976 and is now the world’s largest ticketing provider, operating in over 30 countries
4. PEAKY TRAFFIC
A PROBLEM
We had:
• Lots of traffic during peak times
• Bare-metal capacity going underused at
other times
• Customer service issues with existing
traffic
5. INVESTIGATION
A PROBLEM
How do we get tens of thousands of customers an
hour onto our platform whilst maintaining stability?
OPTIONS:
6. An Implementation
A PROBLEM
Over a number of months a self-service portal was created to allow internal users to spin up instances on the fly
Advantages
• Tied into our internal authentication system and DNS services
• Simple setup on the network; internal code, so easy to update/upgrade
• Allows us to integrate load balancers into the platform as well, at L2
Disadvantages
• Not a real cloud solution
• No API (would require extensive code updates)
• Reliant on single-threaded code; often requires small updates to fix after patching
9. PEAKY TRAFFIC
NOW
We have:
• Lots of traffic during peak times
• No way to service it effectively with our
virtualization platform; no API to build
on demand or expand easily.
• Some legacy code bases are not suited
to going into EC2 yet
10. INVESTIGATION
NOW
How do we get hundreds of thousands of customers
an hour onto our platform whilst maintaining stability?
OPTIONS:
11. Another Implementation
We took some of the best-of-breed open source solutions and tested them out:
NOW
• Virtualisation platform, not a cloud
• Lacking features, but a very promising solution
• Massive system, but a hard implementation with a very steep learning curve
• CloudStack: monolithic app, but provides everything we need
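The gap named on the earlier slides was the lack of an API to build on demand. As a rough illustration of what API-driven provisioning involves, here is a minimal Python sketch of signing a CloudStack API request, based on the signing scheme described in the CloudStack API documentation (sort the parameters, URL-encode the values, lower-case the string, HMAC-SHA1 it with the secret key, base64 the result). The keys and IDs are placeholders, the helper names `sign_request` and `build_query` are ours, and nothing here is taken from the Ticketmaster codebase.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params: dict, secret_key: str) -> str:
    """Return the base64-encoded HMAC-SHA1 signature for a set of API parameters."""
    # Sort by (lower-cased) parameter name and URL-encode each value.
    ordered = sorted(params.items(), key=lambda kv: kv[0].lower())
    canonical = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in ordered)
    # The signature is computed over the lower-cased string.
    digest = hmac.new(
        secret_key.encode("utf-8"),
        canonical.lower().encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def build_query(command: str, args: dict, api_key: str, secret_key: str) -> str:
    """Build a signed query string for one API call. Nothing is sent over the network."""
    params = {"command": command, "response": "json", "apikey": api_key, **args}
    signature = sign_request(params, secret_key)
    ordered = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in ordered)
    return f"{query}&signature={quote(signature, safe='')}"


if __name__ == "__main__":
    # Placeholder IDs and keys, for illustration only.
    print(build_query(
        "deployVirtualMachine",
        {"zoneid": "ZONE-ID", "templateid": "TEMPLATE-ID", "serviceofferingid": "OFFERING-ID"},
        api_key="EXAMPLE-API-KEY",
        secret_key="EXAMPLE-SECRET",
    ))
```

The resulting query string would be appended to the management server's API endpoint; the endpoint URL depends on the deployment and is left out here.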
12. Another set of issues
NOW
The road to CloudStack has had issues:
• LDAP authentication issues
• L3 network implementation
• Zone setups in the beginning (VLAN vs VXLAN)
• Storage issues (iSCSI vs NFS)
• Interesting KVM foibles
• Lack of HA host support for iSCSI primary storage
But we’ve stuck to our goal and have had our platform running for nearly a year now!
13. Our Solution
We have two production clouds in two separate data centers:
NOW
Master / Master: distinctly separated; an issue in one allows us to switch loads to the other
Java API Extensions: to facilitate the migration from our legacy Xen “Cloud”
Runtime JAR: talks to DNS, Inventory and the legacy Xen Cloud
Legacy VM destroyed: once the process is complete
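The migration flow on this slide can be pictured as a short sequence in which the legacy VM is destroyed only as the final step. The sketch below is in Python rather than the Java extension the slide refers to, and it only illustrates ordering: `FakeEstate`, `migrate` and the order of the DNS and inventory updates are assumptions, not the actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEstate:
    """In-memory stand-in for DNS, inventory and the legacy Xen cloud (all hypothetical)."""
    dns: dict = field(default_factory=dict)        # hostname -> IP address
    inventory: dict = field(default_factory=dict)  # hostname -> platform name
    legacy_vms: set = field(default_factory=set)   # hostnames still on legacy Xen
    log: list = field(default_factory=list)        # ordered record of steps taken


def migrate(estate: FakeEstate, hostname: str, new_ip: str) -> bool:
    """Move one host to the new cloud; destroy the legacy VM only as the last step."""
    if hostname not in estate.legacy_vms:
        estate.log.append(f"skip {hostname}: not on legacy platform")
        return False
    # 1. Point DNS at the replacement instance.
    estate.dns[hostname] = new_ip
    estate.log.append(f"dns {hostname} -> {new_ip}")
    # 2. Record the new platform in the inventory.
    estate.inventory[hostname] = "cloudstack"
    estate.log.append(f"inventory {hostname} -> cloudstack")
    # 3. Only once everything else has succeeded, remove the legacy VM.
    estate.legacy_vms.discard(hostname)
    estate.log.append(f"destroy legacy {hostname}")
    return True


if __name__ == "__main__":
    estate = FakeEstate(legacy_vms={"app01"})
    migrate(estate, "app01", "10.10.0.15")
    print("\n".join(estate.log))
```

Keeping the destroy step last means a failure earlier in the sequence leaves the legacy VM available as a fallback.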
14. Provisioning
As with all clouds, automation is the key to management:
NOW
Ansible: provisioning of hardware assets
Automation systems: Python, Ansible, Terraform
Network tenants: created a Python system that pulls its configuration from a YAML config
Ansible AWX: to push out updates to hypervisors, controllers and some VMs
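As an illustration of the config-driven approach to network tenants, here is a minimal Python sketch that turns an already-parsed configuration into a validated plan. The schema (`tenants`, `name`, `vlan`, `cidr`), the function name `plan_tenants` and the choice of the first address as gateway are all assumptions made for the example; the real system's format is not shown in the slides. In practice the dictionary would come from a YAML parser such as PyYAML's `yaml.safe_load`.

```python
import ipaddress

# Parsed form of a hypothetical YAML file describing network tenants.
CONFIG = {
    "tenants": [
        {"name": "web", "vlan": 110, "cidr": "10.10.0.0/24"},
        {"name": "batch", "vlan": 120, "cidr": "10.20.0.0/24"},
    ]
}


def plan_tenants(config: dict) -> list:
    """Validate tenant definitions and return the network settings to create."""
    seen_vlans = set()
    plan = []
    for tenant in config.get("tenants", []):
        # ip_network raises ValueError on a malformed CIDR or one with host bits set.
        network = ipaddress.ip_network(tenant["cidr"])
        vlan = int(tenant["vlan"])
        if not 1 <= vlan <= 4094:
            raise ValueError(f"tenant {tenant['name']}: VLAN {vlan} out of range")
        if vlan in seen_vlans:
            raise ValueError(f"tenant {tenant['name']}: VLAN {vlan} already used")
        seen_vlans.add(vlan)
        plan.append({
            "name": tenant["name"],
            "vlan": vlan,
            # Assumption: the gateway is the first address in the range.
            "gateway": str(network.network_address + 1),
            "netmask": str(network.netmask),
        })
    return plan


if __name__ == "__main__":
    for entry in plan_tenants(CONFIG):
        print(entry)
```

Validating the whole file before creating anything means a typo in one tenant is caught before any network is touched.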
15. Monitoring
NOW
During the CloudStack build-out we also had to deprecate our legacy monitoring platform.
New platform based on Prometheus, with multiple monitoring plugins and exporters:
• Storage backend
• CloudStack exporter
• MySQL exporter
• OVS exporter
Lots of extra monitoring components that don’t come with the CloudStack Prometheus endpoint.
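For readers unfamiliar with exporters: each one publishes metrics in the Prometheus text exposition format, which Prometheus then scrapes over HTTP. The Python sketch below renders gauge samples in that format using only the standard library. The metric name `ovs_bridge_ports` and the sample values are invented for illustration and are not taken from any of the exporters listed on this slide.

```python
def _escape(value) -> str:
    """Escape a label value as the text exposition format requires."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(name: str, help_text: str, samples: dict) -> str:
    """Render gauge samples as Prometheus text.

    `samples` maps a tuple of (label, value) pairs to a numeric sample value.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{k}="{_escape(v)}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric: number of ports attached to each OVS bridge.
    print(render_metrics(
        "ovs_bridge_ports",
        "Ports attached to each bridge",
        {(("bridge", "br0"),): 12, (("bridge", "br1"),): 4},
    ), end="")
```

A real exporter would serve this text from an HTTP endpoint (conventionally `/metrics`) and would usually be built on an existing Prometheus client library rather than hand-rolled formatting.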
17. What features would we like to see next?
• Serverless/Lambda-style integration (OpenWhisk?)
• Better host HA support
• More pluggable infrastructure for adding LB/FW etc. components
• 2FA support
• Hyper-V support
QUESTIONS