Mark Marsiglio - Autoscaling with eZ in the Cloud - A Case Study
1. Autoscaling eZ in the Cloud
The Cloud is cool, but robots in the cloud are better
Mark Marsiglio, President/CEO, ThinkCreative
16 Jun 2011
Friday, June 17, 2011
2. Client & Developer Expectations
Typical vs. Cloud System Design
Auto-Scaling Cloud
Economics
3. Project Goals
Be more “cloudy”
Serve more pages, faster
Reduce hosting costs
Increase platform flexibility
Automate as much as possible
Be fully redundant, recover faster
Get woken up less by downtime alerts
4. Client Needs / Dev Needs
Client Needs:
Predictable cost
Burst capacity
Uptime
Load speed
Managed service
Trust that it is “handled”
Security
Dev Needs:
Separate development environments
Easy deployment
Uptime
System stability
Ability to recover from human error quickly
Fast
5. Dedicated Servers
Typical Hosting Platform, Our Original Approach
[Diagram: client sites spread across three big, fast dedicated web servers (server 1, server 2, server 3)]
Each server runs:
Apache/PHP
Local eZ Find
Local File Sys
Local MySQL
Backup tar file sent to S3 for storage
6. Risk / Reward
Risk:
Many points of failure
Hard to restore
Slow download of backups
Heavy backup load
Inability to back up large sites (disk space)
Traffic surge can overwhelm
Capacity estimation
Reward:
Pretty fast
Better than a shared server
Otherwise, not much
7. Dedicated Cloud Servers
Our 1st Generation Cloud Platform
[Diagram: client sites spread across three big, fast dedicated Amazon instances (server 1, server 2, server 3)]
Each instance runs:
Apache/PHP
Local eZ Find
Local File Sys
Local MySQL
Snapshots
8. vs. Original System
Good (For Clients):
High quality backups
Better failure recovery
Bad (For Clients):
More problems vs. dedicated servers
Still no failover
Good (For Us):
Lower cost
Easy to create more instances
Bad (For Us):
No automation
Higher failure rate
9. Reward / Risk
Reward:
Fast and frequent backup snapshots
Super-fast backup restoration
Easy to create new instances as needed (AMI)
Technically, it’s in the cloud
Risk:
Many points of failure
Kind of hard to restore
System design changes are hard
Large traffic surge can overwhelm
Miscalculation of capacity requires DNS hassle
10. New Cloud Hosting Platform
Auto-scaling array of single-purpose servers
11. [Diagram: auto-scaling array of single-purpose servers]
Client sites, some with SSL, behind Elastic Load Balancers
Web app array: big, fast web app servers (server 1, server 2, server 3, server 4, etc.), each with OS, Apache/PHP, caches and synchronized apache config
RightScript-powered auto-scaling and scheduled min sizes
Additional instances as needed, based on traffic load, time of day
MySQL Master and MySQL Slave (RightScale template, can be promoted), with snapshots
eZ Find/Forwarder and eZ Find data store, with snapshots
NFS Server: EBS RAID holding site data, extensions, kernel, ini and logs, with snapshots
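The diagram pairs load-based scaling with "scheduled min sizes" that depend on the time of day. The slides do not show how the schedule is configured, so the following is only a minimal sketch of the idea: a time-of-day floor that load-based scaling can exceed but never undercut. The schedule hours, the sizes and the function names are all hypothetical, not taken from the talk or from RightScale.

```python
# Hypothetical schedule: (start hour, end hour, minimum web app servers).
# The hours and sizes are illustrative assumptions, not values from the talk.
SCHEDULE = [
    (0, 7, 2),    # overnight
    (7, 19, 5),   # business hours
    (19, 24, 3),  # evening
]


def scheduled_min_size(hour, schedule=SCHEDULE):
    """Return the minimum array size for an hour of the day (0-23)."""
    for start, end, size in schedule:
        if start <= hour < end:
            return size
    raise ValueError(f"hour {hour} is not covered by the schedule")


def target_size(hour, load_based_size):
    """Load can push the array above the scheduled floor, never below it."""
    return max(scheduled_min_size(hour), load_based_size)


if __name__ == "__main__":
    print(target_size(9, 3))   # floor wins during business hours
    print(target_size(9, 7))   # load wins when traffic is high
```

Keeping the floor separate from the load signal means a quiet morning still starts with enough servers to absorb the first burst while new instances launch.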
12. vs. 1st Gen System
Good (For Clients):
Less downtime
Failover systems
Dev/staging/production
Bad (For Clients):
Single point of failure in NFS filesystem
More expensive
Good (For Us):
Automatic scaling
Scripted instance launch
High tolerance of failure
Bad (For Us):
Higher cost of operation
More complex arch.
13. Current Design Notes
Amazon EC2, c1.medium, m1.large
RightScale, scripted instance launching
Unique ELB for each SSL site
Approximately 5-7 servers running, 16 GB/day, 30 req/sec
Array servers vote to scale, 2 new servers in ~4 mins
Array members first in-first out
Development/Staging/Production
Scripted deployment, version controlled
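Two of the notes above describe array behavior: members "vote to scale" (adding 2 servers at a time) and leave the array "first in-first out". The sketch below illustrates that combination under stated assumptions. The 51% decision threshold, the minimum size, the member names and the ServerArray class are hypothetical and are not RightScale's API; only the resize step of 2 and the FIFO order come from the slide.

```python
from collections import deque


class ServerArray:
    """Toy model of a voting server array with first in, first out removal."""

    def __init__(self, members, min_size=2, resize_by=2, threshold=0.51):
        self.members = deque(members)  # oldest member sits on the left
        self.min_size = min_size       # assumed floor, not from the talk
        self.resize_by = resize_by     # "2 new servers" per scaling action
        self.threshold = threshold     # assumed share of votes needed to act
        self._launched = len(self.members)

    def decide(self, votes):
        """votes maps member name to 'grow', 'shrink' or None."""
        if not self.members:
            return "grow"
        total = len(self.members)
        grow = sum(1 for m in self.members if votes.get(m) == "grow")
        shrink = sum(1 for m in self.members if votes.get(m) == "shrink")
        if grow / total >= self.threshold:
            return "grow"
        if shrink / total >= self.threshold:
            return "shrink"
        return "hold"

    def apply(self, decision):
        """Grow by appending new members; shrink by removing the oldest."""
        if decision == "grow":
            for _ in range(self.resize_by):
                self._launched += 1
                self.members.append(f"web-{self._launched}")
        elif decision == "shrink":
            removable = max(len(self.members) - self.min_size, 0)
            for _ in range(min(self.resize_by, removable)):
                self.members.popleft()  # first in, first out
        return list(self.members)


if __name__ == "__main__":
    array = ServerArray(["web-1", "web-2", "web-3"])
    decision = array.decide({"web-1": "grow", "web-2": "grow"})
    print(decision, array.apply(decision))
```

Removing the oldest member first means every server is eventually recycled, so a long-lived instance with drifted state does not linger in the array.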
14. Current Design Limitations
NFS bottleneck
NFS single point of failure
Little CDN advantage
Unreliable sendmail email delivery
Limited data on per-client usage
No static IP for ELBs
And...
15. The Cloud is Falling!
Judgement Day
April 21, 2011
16. Single-cloud dependence
Planned for...
Instance failure
Availability zone failure
Data store failure
Network connectivity failure
Database failure
What if the entire AWS system fails?
18. [Diagram: hosting platform with SQL cluster, Gluster and third-party services]
Client sites and non-client sites, some with SSL, behind Elastic Load Balancers
Web app array: big, fast web app servers (server 1, server 2, server 3, server 4, etc.), each with OS, Apache/PHP, caches and synchronized apache config
RightScript-powered auto-scaling and scheduled min sizes
Additional instances as needed, based on traffic load, time of day
Reporting/Analytics/Log Analysis (Splunk)
SQL Cluster and SQL Slave, with snapshots
eZ Find and eZ Find data store, with snapshots
Gluster and Gluster Bricks, with snapshots
Video CDN, Transcoding System and S3 Storage
File CDN
Postmark SMTP
Sendlabs SMTP
19. More cloud-only benefits
Replicate the system:
Dedicated client arrays, multiple regions, complete staging environments, load testing copies, etc
[Diagram: several overlapping copies of the platform diagram (client sites, Elastic Load Balancers with SSL, RightScript-powered auto-scaling web app servers, MySQL and MySQL Slave, eZ Find, NFS on EBS RAID, Video CDN, File CDN, Transcoding System, S3 Storage, Splunk log analysis, snapshots)]
21. System development costs
Staff research & development time
Maintenance of legacy systems
Pre-launch service subscriptions, monthly fees
Migration time/cost, upgrades
Expert consulting
22. New Ongoing Costs
AWS hourly costs
AWS Backup storage
AWS bandwidth
CloudFront CDN bandwidth
RightScale (scripted servers)
Pingdom (monitoring)
Postmark (SMTP)
Bits on the Run (Video)
DNS Made Easy (Dynamic DNS)
GitHub (version control)
About US$5,000/mo Total
23. New Revenue
Higher monthly hosting fees (US$500-800/mo avg)
Much greater hosting capacity (unlimited sites?)
Sell hosting to other developers
Reduced concessions for downtime
Reduced management time, automation
Reduced legacy system costs
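Taken together, the ongoing cost of about US$5,000/mo and the average hosting fee of US$500-800/mo suggest a rough break-even point. The arithmetic below is a sketch under two assumptions that the slides do not state: that the fee is charged per hosted client site, and that only the ongoing monthly cost is counted (not the one-time development costs). The function name is hypothetical.

```python
import math

MONTHLY_PLATFORM_COST = 5000  # "About US$5,000/mo Total" from the costs slide
FEE_RANGE = (500, 800)        # "US$500-800/mo avg" from the revenue slide


def sites_to_break_even(monthly_cost, fee_per_site):
    """Smallest whole number of sites whose fees cover the monthly cost.

    Assumes one flat fee per site, which the slides do not confirm.
    """
    return math.ceil(monthly_cost / fee_per_site)


if __name__ == "__main__":
    low_fee, high_fee = FEE_RANGE
    print(sites_to_break_even(MONTHLY_PLATFORM_COST, low_fee))   # 10
    print(sites_to_break_even(MONTHLY_PLATFORM_COST, high_fee))  # 7
```

Under those assumptions the platform covers its ongoing cost at roughly 7 to 10 hosted sites, and each additional site on the shared array adds revenue without a matching fixed cost.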
24. Thank you.
mark@thinkcreative.com
Mark Marsiglio, President/CEO, ThinkCreative
16 Jun 2011