Building Scalable Web Applications for the Cloud

                 Carl Mercier (@cmercier)
                 Director of software development, Websense Inc.
                 Founder, Defensio.com

                 cmercier@websense.com



                                                                   O U T S M A R T I N G E V I L S PA M




Friday, March 12, 2010
Security for the Social Web

                         We protect your website from spam,
                                malicious content,
                          unwanted URLs and profanity.




The Cloud is different




Architecture challenges


                     •   We’re an API, not a website

                     •   Many millions of requests per day, nonstop

                     •   Each request can be fast or slow

                     •   Very little caching possible



Architecture challenges

                     •   Write intensive

                     •   Traffic comes in spikes

                     •   Any downtime is catastrophic

                     •   2 different versions of our APIs

                     •   Bootstrapped startup. We’re broke!


Getting technical

                     •   Built in Ruby (Rails, Merb and pure Ruby)
                     •   External services written in Perl and C
                     •   100% hosted on Amazon EC2
                     •   Mix of 32-bit and 64-bit machines
                         •   mostly m1.small (the cheapest ones)




Prototyping/1.0 beta
                                       aka The Spaghetti Release




                     •   Single Ruby on Rails application
                     •   No direction whatsoever
                     •   A few small EC2 instances
                     •   A single MySQL




Prototyping/1.0 beta
                                       aka The Spaghetti Release

                     •   Horizontal scaling: start more instances
                     •   This also scaled the website
                     •   Eventually moved MySQL to m1.large

                               DNS Round Robin

                         NGINX + API + WEB           NGINX + API + WEB

                                             MySQL




What was wrong?

                     •   Unmaintainable code
                     •   Why did it even work?
                         •   but it REALLY did work, and well! :)

                     •   DNS Round Robin
                     •   Very database intensive
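
The slides do not say why DNS round robin made this list. One common reason is that DNS keeps handing out the address of an instance that has died. The sketch below is a toy illustration of the alternative, a balancer that rotates through backends but skips the ones marked down. The class and method names are hypothetical and are not HAProxy's or ELB's API.

```python
class HealthAwareRoundRobin:
    """Toy round-robin balancer that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, starting where we left off.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")
```

Plain DNS round robin corresponds to calling pick() without ever calling mark_down(), so clients keep being sent to dead instances until the DNS records are changed and caches expire.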



The Big Rewrite

                     •   Complete code rewrite
                     •   Proper code separation
                     •   Completely tested
                     •   Ruby + Merb + DataMapper
                     •   Replaced DNS RR with HAProxy
                     •   Added Memcached to the mix
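
The slides do not describe how the cache was used. A minimal sketch of the usual pattern when Memcached sits in front of MySQL (check the cache, fall back to the database, store the result with a time-to-live) might look like this. FakeCache, FakeDatabase and get_account are hypothetical in-memory stand-ins, not a real Memcached or MySQL client.

```python
import time


class FakeCache:
    """In-memory stand-in for a Memcached client (hypothetical)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl):
        self._items[key] = (value, time.monotonic() + ttl)


class FakeDatabase:
    """In-memory stand-in for the database; counts queries for the demo."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def find(self, key):
        self.queries += 1
        return self.rows.get(key)


def get_account(cache, db, account_id, ttl=60):
    """Return an account row, hitting the database only on a cache miss."""
    key = f"account:{account_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = db.find(account_id)
    if row is not None:
        cache.set(key, row, ttl)
    return row
```

Two consecutive get_account calls for the same id result in a single database query; missing rows are not cached in this sketch.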


The Big Rewrite
                                  architecture

                                                    HAProxy




                          NGINX + API (Merb)   NGINX + API (Merb)   NGINX + API (Merb)




                                               MySQL + Memcached




Later Improvements

                     •   Dumped HAProxy (single point of failure)
                         •   replaced with Amazon ELB

                     •   Moved Memcached to its own machine
                     •   Decoupled resource-intensive parts
                         •   turned them into web services




The Big Rewrite
                         architecture, revisited
                                         Amazon ELB




                                      NGINX + API (Merb)
                                       many EC2 instances




                                            MySQL




                                          Memcached




                                Web Service 1     Web Service n




Advantages of this architecture


                     •   Easy to scale horizontally OR vertically
                     •   Each unit can be scaled & tweaked independently
                     •   Easy to maintain
                     •   Increased redundancy




MySQL Pain

                     •   Traffic keeps growing
                     •   Adding millions of records per day
                     •   Database size growing exponentially
                     •   Most of this data was non-critical
                     •   Stuck with our schemas and indexes



Scaling MySQL on EC2

                     •   If your DB fits in memory, don’t worry, be happy!
                     •   It’s painful.
                     •   You should be on EBS or equivalent
                         •   permanent and robust storage

                         •   EBS snapshots




Scaling MySQL on EC2


                     •   Scale up (move to a bigger machine)
                         •   More RAM

                         •   Database often IO bound

                     •   RAID 0 (stripe)
                         •   Inconsistent EBS snapshots




Scaling MySQL on EC2


                     •   Replication
                         •   headache

                         •   all writes go to master

                     •   Split database
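
To make the "all writes go to master" point concrete, here is a small sketch of read/write splitting under replication. It assumes a single master and a list of read replicas; ReplicatedRouter and the connection names are hypothetical, and a real setup would also have to deal with replication lag.

```python
import itertools


class ReplicatedRouter:
    """Send SELECTs to replicas in rotation; send everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas) if replicas else None

    def connection_for(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        # INSERT/UPDATE/DELETE (and anything unrecognized) must hit the master.
        return self.master
```

For a write-intensive workload nearly every statement takes the last branch, so adding replicas does little to relieve the master. Splitting the database is the option on the slide that spreads writes.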




MongoDB


                     •   Document-oriented database
                     •   Schema-less
                     •   Fast
                     •   Replication, fail-over, auto-sharding




Three Data Stores

                     •   MySQL (critical data)
                         •   accounts, keys, account settings, statistics

                     •   MongoDB (semi non-critical)
                         •   documents, reputations

                     •   Memcached (non-critical data)
                         •   short term, very fast updates
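
One way to picture the split is a lookup from record kind to store, so that application code never decides ad hoc where data lives. This is only a sketch: the kind names loosely follow the examples on the slide, "rate_counter" is an assumed example of short-term data, and store_for is a hypothetical helper.

```python
# Hypothetical mapping from record kind to data store, by criticality.
STORE_FOR_KIND = {
    "account": "mysql",            # critical data
    "api_key": "mysql",
    "account_settings": "mysql",
    "statistics": "mysql",
    "document": "mongodb",         # semi non-critical data
    "reputation": "mongodb",
    "rate_counter": "memcached",   # assumed example of short-term data
}


def store_for(kind):
    """Return the name of the store that should hold records of this kind."""
    try:
        return STORE_FOR_KIND[kind]
    except KeyError:
        raise ValueError(f"no store configured for kind {kind!r}") from None
```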




Three Data Stores
                                        Amazon ELB




                                     NGINX + API (Merb)
                                      many EC2 instances




                                           MySQL
                                           m1.small




                                         MongoDB
                                               64-bit




                                        Memcached




                               Web Service 1            Web Service n




API 2.0 Challenges


                     •   Completely new API to the user
                     •   Keep support for 1.x
                     •   Asynchronous
                     •   New features, can’t just wrap API 1.x




Frontend


                     •   Ruby on Rails
                     •   Accepts HTTP connections
                     •   Knows the API definition for both 1.x and 2.0
                     •   Converts API calls into “jobs”




Frontend


                     •   Jobs are put in a queue
                     •   Backend responds with generic response
                     •   Frontend converts response and renders
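
The frontend's role can be sketched as two small translations: versioned API call to version-agnostic job, and generic backend response to a version-specific rendering. Everything below is illustrative. The job fields and both output formats are assumptions, not Defensio's actual wire formats.

```python
import json

SUPPORTED_VERSIONS = ("1.x", "2.0")


def to_job(api_version, action, params):
    """Turn an API call into a job that carries no trace of the API version."""
    if api_version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported API version: {api_version}")
    return {"action": action, "params": dict(params)}


def render(api_version, response):
    """Render the backend's generic response for the caller's API version."""
    if api_version == "1.x":
        # Assumed legacy format: one "key: value" pair per line.
        return "\n".join(f"{k}: {v}" for k, v in sorted(response.items()))
    if api_version == "2.0":
        # Assumed newer format: JSON.
        return json.dumps(response, sort_keys=True)
    raise ValueError(f"unsupported API version: {api_version}")
```

Because both versions produce the same job, the backend only ever sees one shape of work, which is what lets a single backend serve API 1.x and 2.0.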




Queue/Messaging: RabbitMQ


                     •    Messaging (AMQP)
                     •    Ultra-fast
                     •    Feature-rich
                     •    Complex (too complex for our needs)




Queue/Messaging: Beanstalkd


                     •    Ultra-simple queue
                     •    Not a messaging server (hack it to make it behave like one!)
                     •    Just as fast as RabbitMQ
                     •    Delayed jobs
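
The slides do not show how the queue was "hacked" into a messaging server. One plausible approach, sketched here with a toy in-memory queue rather than beanstalkd itself, is to give every job a private reply tube that the requester later reads from. TinyQueue, request and work_once are hypothetical names, and the uppercase "processing" step is a placeholder.

```python
import heapq
import itertools


class TinyQueue:
    """Toy queue with named tubes and delayed jobs (not beanstalkd's API)."""

    def __init__(self):
        self._tubes = {}
        self._seq = itertools.count()  # tie-breaker keeps FIFO order

    def put(self, tube, body, now, delay=0):
        heap = self._tubes.setdefault(tube, [])
        heapq.heappush(heap, (now + delay, next(self._seq), body))

    def reserve(self, tube, now):
        """Return the next job that is ready at time `now`, or None."""
        heap = self._tubes.get(tube)
        if heap and heap[0][0] <= now:
            return heapq.heappop(heap)[2]
        return None


def request(queue, job_id, payload, now):
    """Frontend side: enqueue a job that names its own reply tube."""
    job = {"id": job_id, "reply_to": f"reply.{job_id}", "payload": payload}
    queue.put("jobs", job, now)


def work_once(queue, now):
    """Backend side: process one job and put the response on its reply tube."""
    job = queue.reserve("jobs", now)
    if job is None:
        return False
    result = job["payload"].upper()  # placeholder for the real processing
    queue.put(job["reply_to"], {"id": job["id"], "result": result}, now)
    return True
```

Time is passed in explicitly so the delayed-job behavior is easy to follow: a job put with delay=5 at time 0 cannot be reserved until time 5.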




Backend

                     •   Previously our “API” servers
                     •   Doesn’t accept HTTP connections anymore
                     •   Communicates through jobs/response (queue)
                     •   API agnostic. Only knows about jobs/response
                     •   All processing/logic
                     •   Spits a response back in the queue


Current Architecture: API 2.0


                                                   Amazon ELB


                                                                                             Cluster n
                                           API Frontend (Unicorn + Rails)
                                                 many EC2 instances



                                                 Queue/Messaging
                                                   (Beanstalkd)


                                              Backend (hacked Merb)
                                                 many EC2 instances




                            Memcached                   MySQL                      MongoDB
                                                        m1.small                    64-bit




                                        Web Service 1              Web Service n




Advantages

                     •   Awesome fault tolerance
                     •   Horizontal scaling is easy
                         •   Add capacity to a cluster
                         •   Add clusters
                     •   No more MySQL scaling worries
                     •   Complete schema flexibility with MongoDB

                                                Amazon ELB

                                        API Frontend (Unicorn + Rails)
                                              many EC2 instances                          Cluster n

                                              Queue/Messaging
                                                (Beanstalkd)

                                           Backend (hacked Merb)
                                              many EC2 instances

                         Memcached                   MySQL                      MongoDB
                                                     m1.small                    64-bit

                                     Web Service 1              Web Service n




When to scale “out”
                                          (horizontally)




                     •   Instances are identical clones
                     •   Redundancy
                     •   Fast & easy scaling
                     •   Instance is “irrelevant”




What we scale “out”
                                         (horizontally)




                     •   Frontend
                     •   Backend
                     •   Internal web services




When to scale “up”
                                            (vertically)




                     •   Multiple instances are hard to manage (e.g. database)
                     •   CPU or memory intensive applications
                     •   Scaling out becomes impractical
                     •   Scaling out becomes cost-ineffective




I really like

                         scaling out
                               vs. scaling up




Bulletproof your app


Scale & shrink fast
                         even automatically
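
As a rough sketch of what scaling and shrinking automatically can mean, the function below resizes a pool of identical instances so that average utilization moves toward a target. The rule, the 60% target and the size limits are all assumptions made for illustration; they are not taken from the talk or from Amazon's auto-scaling service.

```python
def desired_instances(current, utilization_pct, target_pct=60, minimum=2, maximum=20):
    """Return the instance count that would bring utilization near the target.

    Utilization is given in whole percent so the arithmetic stays exact.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target_pct < 1:
        raise ValueError("target_pct must be at least 1")
    # Ceiling division: round up so we never undershoot capacity.
    wanted = -(-(current * utilization_pct) // target_pct)
    return max(minimum, min(maximum, wanted))
```

With 4 instances at 90% utilization the rule asks for 6; at 15% it asks for 1 and is held at the minimum of 2, so the pool shrinks as well as grows.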


Most cost-effective


Things I learned


                     •   Cloud instances are disposable
                     •   Architect your app accordingly
                     •   Instances should be killed, not fixed
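
"Killed, not fixed" can be read as a simple reconciliation step: terminate anything unhealthy and launch clones until the desired count is met. The sketch below only computes that plan. The reconcile function and the instance ids are hypothetical, and the real terminate and launch calls are left out.

```python
def reconcile(instances, desired_count):
    """Plan replacements for a pool of identical, disposable instances.

    `instances` maps instance id -> True if healthy, False otherwise.
    Returns (ids to terminate, number of fresh instances to launch).
    """
    to_terminate = sorted(i for i, healthy in instances.items() if not healthy)
    healthy_count = len(instances) - len(to_terminate)
    to_launch = max(0, desired_count - healthy_count)
    return to_terminate, to_launch
```

Nobody logs in to repair the broken instance here, which only works if every instance can be recreated from an image.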




Things I learned

                     •   Pre-optimizing is useless
                     •   Be aware of your bottlenecks
                     •   Architect your application for flexibility
                     •   Deploy different parts to different servers
                     •   Secure your important data



Things I learned (about EC2)


                     •   It is pretty reliable; anything else you heard is a myth
                     •   When shit hits the fan, you’re on your own
                     •   Create images
                     •   Automate as much as you can




Things I learned (about EC2)


                     •   Auto-scaling is easy, but rarely needed
                     •   IO is inconsistent and mostly sucks
                     •   Slowish (Rackspace Cloud is much faster)
                     •   Large(r) instances are too expensive




Questions?

                                           Twitter: @cmercier and @defensio
                                             Email: cmercier@websense.com
                                                            Web: www.defensio.com



[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 

Building Scalable Web Applications For The Cloud

  • 1. Building Scalable Web Applications for the Cloud • Carl Mercier (@cmercier) • Director of software development, Websense Inc. • Founder, Defensio.com • cmercier@websense.com • OUTSMARTING EVIL SPAM • Friday, March 12, 2010
  • 2. Security for the Social Web. We protect your website from spam, malicious content, unwanted URLs and profanity.
  • 3. The Cloud is different
  • 4. Architecture challenges • We’re an API, not a website • Many million requests per day, nonstop • Each request can be fast or slow • Very little caching possible
  • 5. Architecture challenges • Write intensive • Traffic comes in spikes • Any downtime is catastrophic • 2 different versions of our APIs • Bootstrapped startup. We’re broke!
  • 6. Getting technical • Built in Ruby (Rails, Merb and pure Ruby) • External services written in Perl and C • 100% hosted on Amazon EC2 • Mix of 32- and 64-bit machines • mostly m1.small (the cheapest ones)
  • 7. Prototyping/1.0 beta aka The Spaghetti Release • Single Ruby on Rails application • No direction whatsoever • A few small EC2 instances • A single MySQL database
  • 8. Prototyping/1.0 beta aka The Spaghetti Release • Horizontal scaling: start more instances • This also scaled the website • Eventually moved MySQL to m1.large [Diagram: DNS Round Robin → two NGINX + API + WEB nodes → MySQL]
  • 9. What was wrong? • Unmaintainable code • Why did it even work? • but it REALLY did work, and well! :) • DNS Round Robin • Very database intensive
  • 10. The Big Rewrite • Complete code rewrite • Proper code separation • Completely tested • Ruby + Merb + DataMapper • Replaced DNS RR with HAProxy • Added Memcached to the mix
  • 11. The Big Rewrite architecture [Diagram: HAProxy → three NGINX + API (Merb) nodes → MySQL + Memcached]
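The move from DNS round robin to HAProxy on slides 8 to 11 can be illustrated with a minimal Python sketch. It shows one commonly cited difference, not the reason the talk gives: a proxy-style balancer can skip backends it knows are down, while plain DNS round robin keeps handing out every address. The backend names and the `mark_down`/`mark_up` helpers are hypothetical, and this is not how HAProxy itself is implemented.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate over backends, skipping any currently marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # In a real proxy this would be driven by periodic health checks.
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call so we never loop forever.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no healthy backends")


# Hypothetical backend names standing in for the NGINX + API (Merb) nodes.
balancer = RoundRobinBalancer(["api-1", "api-2", "api-3"])
balancer.mark_down("api-2")
picks = [balancer.next_backend() for _ in range(4)]
print(picks)
```

With `api-2` marked down, requests alternate between the two remaining nodes instead of failing one time in three.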
  • 12. Later Improvements • Dumped HAProxy (single point of failure) • replaced with Amazon ELB • Moved Memcached to its own machine • Decoupled resource-intensive parts • turned them into web services
  • 13. The Big Rewrite architecture, revisited [Diagram: Amazon ELB → NGINX + API (Merb), many EC2 instances → MySQL, Memcached, Web Service 1 to Web Service n]
  • 14. Advantages of this architecture • Easy to scale horizontally OR vertically • Each unit can be scaled & tweaked independently • Easy to maintain • Increased redundancy
  • 15. MySQL Pain • Traffic keeps growing • Adding millions of records per day • Database size growing exponentially • Most of this data was non-critical • Stuck with our schemas and indexes
  • 16. Scaling MySQL on EC2 • If your DB fits in memory, don’t worry, be happy! • It’s painful. • You should be on EBS or equivalent • permanent and robust storage • EBS snapshots
  • 17. Scaling MySQL on EC2 • Scale up (move to a bigger machine) • More RAM • Database often IO bound • RAID 0 (stripe) • Inconsistent EBS snapshots
  • 18. Scaling MySQL on EC2 • Replication • headache • all writes go to master • Split database
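The replication point on slide 18 ("all writes go to master") can be sketched as a small read/write router. This is a minimal Python illustration under stated assumptions: `FakeConnection` stands in for a real database connection, the table names are hypothetical, and detecting writes by the leading SQL verb is a simplification.

```python
from itertools import cycle


class FakeConnection:
    """Stand-in for a real database connection; records what it is asked to run."""

    def __init__(self, name):
        self.name = name
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)
        return self.name


class ReplicatedRouter:
    """Send writes to the master and spread reads across the replicas."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = cycle(replicas)

    def execute(self, sql):
        # Simplification: classify a statement by its first word only.
        verb = sql.lstrip().split(None, 1)[0].upper()
        target = self.master if verb in self.WRITE_VERBS else next(self._replicas)
        return target.execute(sql)


master = FakeConnection("master")
replicas = [FakeConnection("replica-1"), FakeConnection("replica-2")]
router = ReplicatedRouter(master, replicas)

served_by = [
    router.execute("INSERT INTO documents (body) VALUES ('hi')"),
    router.execute("SELECT * FROM documents"),
    router.execute("SELECT * FROM accounts"),
    router.execute("UPDATE accounts SET plan = 'pro'"),
]
print(served_by)
```

Adding replicas increases read capacity only. Every write still lands on the single master, which matters for a workload the talk describes as write intensive.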
  • 19. MongoDB • Document-oriented database • Schema-less • Fast • Replication, fail-over, auto-sharding
  • 20. Three Data Stores • MySQL (critical data) • accounts, keys, account settings, statistics • MongoDB (semi non-critical) • documents, reputations • Memcached (non-critical data) • short term, very fast updates
  • 21. Three Data Stores [Diagram: Amazon ELB → NGINX + API (Merb), many EC2 instances → MySQL (m1.small), MongoDB (64-bit), Memcached, Web Service 1 to Web Service n]
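The split on slide 20 amounts to choosing a store by how critical the data is. A minimal Python sketch of that decision follows. The kinds of data come from the slide, but the `store_for` function, the `short_lived` flag and the `request_counter` example are hypothetical.

```python
# Data kinds named on slide 20, grouped by how critical they are.
CRITICAL = {"accounts", "keys", "account_settings", "statistics"}
SEMI_CRITICAL = {"documents", "reputations"}


def store_for(kind, short_lived=False):
    """Pick a data store name for a kind of data.

    `short_lived` is a hypothetical flag for short-term, rapidly updated
    values, which the slide assigns to Memcached.
    """
    if short_lived:
        return "memcached"
    if kind in CRITICAL:
        return "mysql"
    if kind in SEMI_CRITICAL:
        return "mongodb"
    # Fail loudly rather than silently guessing where unknown data belongs.
    raise KeyError(f"no store configured for {kind!r}")


choices = {
    "accounts": store_for("accounts"),
    "documents": store_for("documents"),
    "request_counter": store_for("request_counter", short_lived=True),
}
print(choices)
```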
  • 22. API 2.0 Challenges • Completely new API to the user • Keep support for 1.x • Asynchronous • New features, can’t just wrap API 1.x
  • 23. Frontend • Ruby on Rails • Accepts HTTP connections • Knows the API definition for both 1.x and 2.0 • Converts API calls into “jobs”
  • 24. Frontend • Jobs are put in a queue • Backend responds with generic response • Frontend converts response and renders
  • 25. Queue/Messaging: RabbitMQ • Messaging (AMQP) • Ultra-fast • Feature-rich • Complex (too complex for our needs)
  • 26. Queue/Messaging: Beanstalkd • Ultra-simple queue • Not a messaging server (hack it to make it behave like one!) • Just as fast as RabbitMQ • Delayed jobs
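The "delayed jobs" feature on slide 26 means a job can be queued now but only handed to a worker after a delay. The Python sketch below imitates that idea in memory with a heap. It is not Beanstalkd's implementation or client API, it ignores job priorities, and the job names are hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
import heapq
import itertools


class DelayedQueue:
    """In-memory queue where each job becomes available at a given time."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def put(self, body, now, delay=0):
        # A job is ready once the clock reaches now + delay.
        heapq.heappush(self._heap, (now + delay, next(self._counter), body))

    def reserve(self, now):
        """Return the next ready job, or None if nothing is ready yet."""
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[2]
        return None


q = DelayedQueue()
q.put("recheck-document-42", now=0, delay=30)  # hypothetical job names
q.put("classify-document-43", now=0)

first = q.reserve(now=1)    # the undelayed job is ready
second = q.reserve(now=1)   # the delayed job is not ready yet
third = q.reserve(now=31)   # 30 seconds have passed
print(first, second, third)
```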
  • 27. Backend • Previously our “API” servers • Doesn’t accept HTTP connections anymore • Communicates through jobs/responses (queue) • API agnostic. Only knows about jobs/responses • All processing/logic • Spits a response back into the queue
  • 28. Current Architecture API 2.0 [Diagram: Amazon ELB → Cluster n: API Frontend (Unicorn + Rails), many EC2 instances → Queue/Messaging (Beanstalkd) → Backend (hacked Merb), many EC2 instances → MySQL (m1.small), MongoDB (64-bit), Memcached, Web Service 1 to Web Service n]
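The frontend/queue/backend split described on slides 23 to 28 can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: `queue.Queue` stands in for Beanstalkd, the request and response field names for API 1.x and 2.0 are hypothetical, and the spam check is a placeholder for the real processing.

```python
import json
import queue

jobs = queue.Queue()       # stand-in for the queue carrying jobs
responses = queue.Queue()  # stand-in for the queue carrying responses


def frontend_enqueue(api_version, params):
    """Translate a versioned API call into a version-agnostic job."""
    if api_version == "1.x":
        content = params["comment_content"]   # hypothetical 1.x field name
    elif api_version == "2.0":
        content = params["content"]           # hypothetical 2.0 field name
    else:
        raise ValueError(f"unknown API version: {api_version}")
    jobs.put(json.dumps({"type": "classify", "content": content}))


def backend_work_once():
    """Take one job, process it, and put a generic response on the queue."""
    job = json.loads(jobs.get())
    # Placeholder logic: a real backend would run the actual classification.
    spam = "viagra" in job["content"].lower()
    responses.put(json.dumps({"spam": spam}))


def frontend_render(api_version):
    """Convert the generic response into the shape each API version expects."""
    result = json.loads(responses.get())
    if api_version == "1.x":
        return {"spam": "true" if result["spam"] else "false"}  # hypothetical
    return {"status": "success", "spam": result["spam"]}        # hypothetical


frontend_enqueue("1.x", {"comment_content": "Cheap VIAGRA here"})
backend_work_once()
v1 = frontend_render("1.x")

frontend_enqueue("2.0", {"content": "Nice post, thanks!"})
backend_work_once()
v2 = frontend_render("2.0")
print(v1, v2)
```

The backend function never sees an API version, which mirrors the "API agnostic" bullet: supporting another API version would only touch the two frontend functions.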
  • 29. Advantages • Awesome fault tolerance • Horizontal scaling is easy • Add capacity to a cluster • Add clusters • No more MySQL scaling worries • Complete schema flexibility w/ MongoDB [Diagram: same architecture as slide 28]
  • 30. When to scale “out” (horizontally) • All instances are identical clones • Redundancy • Fast & easy scaling • Instance is “irrelevant”
  • 31. What we scale “out” (horizontally) • Frontend • Backend • Internal web services
  • 32. When to scale “up” (vertically) • Multiple instances are hard to manage (e.g. database) • CPU or memory intensive applications • Scaling out becomes impractical • Scaling out becomes cost-ineffective
  • 33. I really like scaling out vs. scaling up
  • 34. Bulletproof your app
  • 35. Scale & shrink fast, even automatically
  • 36. Most cost-effective
  • 37. Things I learned • Cloud instances are disposable • Architect your app accordingly • Instances should be killed, not fixed
  • 38. Things I learned • Pre-optimizing is useless • Be aware of your bottlenecks • Architect your application for flexibility • Deploy different parts to different servers • Secure your important data
  • 39. Things I learned (about EC2) • It is pretty reliable, anything else you heard is a myth • When shit hits the fan, you’re on your own • Create images • Automate as much as you can
  • 40. Things I learned (about EC2) • Auto-scaling is easy, but rarely needed • IO is inconsistent and mostly sucks • Slowish (Rackspace Cloud is much faster) • Large(r) instances are too expensive
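Slides 35 and 40 mention scaling out and shrinking automatically. One simple way to frame such a decision is a function that maps the current backlog to an instance count. The Python sketch below assumes a rule the talk does not give: the queue-depth metric, the per-instance capacity and the minimum and maximum bounds are all hypothetical.

```python
import math


def desired_instances(queued_jobs, jobs_per_instance, minimum=2, maximum=20):
    """Return how many identical backend instances to run for a given backlog.

    All thresholds are illustrative assumptions, not values from the talk.
    """
    needed = math.ceil(queued_jobs / jobs_per_instance)
    # Never drop below the redundancy floor or exceed the budget ceiling.
    return max(minimum, min(maximum, needed))


quiet = desired_instances(0, jobs_per_instance=100)
busy = desired_instances(950, jobs_per_instance=100)
spike = desired_instances(5000, jobs_per_instance=100)
print(quiet, busy, spike)
```

Because instances are identical and disposable, the result can be applied by starting or killing clones. The floor keeps some redundancy even when traffic is quiet.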
  • 41. Questions? • Twitter: @cmercier and @defensio • Email: cmercier@websense.com • Web: www.defensio.com