1. Scaling High Traffic Web Applications
2. About Me
• Joined Achievers in June 2009
• Prior to Achievers, I was the CTO of ZipLocal
• I have spent the last 7 years worrying about how to build scalable applications
• Academic Background:
– Ph.D. from the University of Toronto
– Naval Research Labs Post Doctoral Fellow of Secure Systems at Cambridge University
3. Goals
• Tell you about our journey to a scalable architecture
• Give you insight into common scaling problems
• Give you a way to think about the issues of scaling that you can apply today
5. What Does Achievers Do
• Achievers started in the rewards and recognition space in 2007
• We provide reward and recognition software
– Points-based system to reward performance
– Catalog to redeem the points
• Our mission is to "Change the way the world works"
7. Our Traffic Growth
• From 2009 to today
– Visits up 903%
– Unique Visitors up 832%
• Last month we did 2.5 million page views
• During business hours we have about 250 people on the site at any given moment
8. Funding
• 3.3 million Series A from JLA Ventures
• 6.9 million Series B from Grandbanks
• 24 million Series C from Sequoia Capital
10. Definitions
• Performance
– Performance measures the speed at which a single request can be executed
• Scalability
– Scalability is the ability to handle a growing number of requests in a capable manner
Scalability != Performance
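To make the distinction concrete, here is a minimal sketch. It assumes an idealized cluster in which every server handles one request at a time and capacity grows linearly with servers, which real systems rarely achieve. The function names and numbers are illustrative and do not come from the talk.

```python
def latency_seconds(work_seconds: float) -> float:
    """Performance: how long one request takes. Adding servers does not change it."""
    return work_seconds


def capacity_per_second(servers: int, work_seconds: float) -> float:
    """Scalability (idealized): requests per second the whole cluster can absorb."""
    return servers / work_seconds


if __name__ == "__main__":
    # Hypothetical request that needs 0.25 seconds of work.
    for servers in (1, 2, 4):
        print(servers, latency_seconds(0.25), capacity_per_second(servers, 0.25))
```

In this model, going from one server to four leaves each request exactly as slow as before while the cluster absorbs four times as many requests.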
11. Which Language Scales the Best?
• Languages Don't Scale, Architectures Do
• If you hear "language X doesn't scale" then turn around and walk away.
– That person doesn't understand scalability
12. There is a bit more to Scalability
• Scalability is also about how you scale the development team
• If you are successful and need to add people, how easy is it for them to contribute?
• How fast can you write code?
– Your competitors are right behind you
– He who can develop good code fast wins!
14. The Achievers Platform
• Multi-tenant architecture
– One code base
– One database
• Module-based platform
– Hundreds of configuration options for each module
– Lots of legacy configurations
15. Backend Processing
• We handle many millions of dollars of orders every month
• We send out hundreds of thousands of emails a month
17. The Stack
• Pretty standard J2EE stack
• Hibernate
• Spring
• JMS
• MySQL
• All running on Amazon EC2
18. Aside – Amazon EC2
• EC2 is great
• Spin up machines for testing then shut them down
• A must for any startup
– Don't manage your own servers when you are small. It isn't worth it
24. Scaling Was an Afterthought
• We had to scale vertically since the underlying design did not consider what would happen if we had 2 web servers
• We had the largest EC2 instance money could buy
• You cannot retrofit scalability
– Your architecture and design either have it or they don't
25. Design Decisions
• Your basic approach and philosophy to a few things will determine how hard it will be to scale your infrastructure
27. Who doesn't like magic?
• Extensive use of Aspect Oriented Programming (AOP)
– Allows you to define "pointcuts" to insert code before or after a function call
• As an academic, AOP is brilliant
• As a CTO, not so much
28. There is a Pattern for That
• Use of design patterns for the sake of using a design pattern
• Don't get me wrong, every developer must know and understand design patterns
• But it isn't a competition to see who can use the most design patterns in any given day
– The right tool for the right job
– Don't force it!
29. Overly complex object model
• The Access Control model had so many objects and relationships that nobody other than the original author ever understood it
30. Why is Complexity Bad?
• If the system dies at two o'clock in the morning and I'm staring at your code, can I easily figure out what's going on?
• People Forget about Magic
– Code needs to be in front of you, not buried in an XML file or magically invoked
31. What Does This Have To Do With Scalability?
• Complex systems are really, really hard to scale
– In a clustered environment you need to first figure out if the problem is because of clustering or because of your code
– This isn't trivial even for simple systems
• Too many things to worry about
• When you hit a wall (and you will) it becomes very hard to figure out what to do
32. Don't Forget About the People
• As you grow your team you need to ramp everybody up
• A complex system takes longer to learn than a simple one
• Complexity ALWAYS increases over time. If you start with something that is complex it will quickly get beyond the scope of a mere mortal
35. The Database
• ORMs make you stupid … kidding … sort of
• You need to understand your data
– Do not let an ORM define your database; you will be sorry
• Generating reports out of an ORM is painful
• Developers must understand how a DB works
– You will forget about what a DB is good for if you don't consider it explicitly
– New developers usually do not understand the importance of the DB in scaling
36. ORMs
• Can they scale?
– Sure
• Is it hard?
– Yup
• A quote from Stack Overflow on scaling ORMs
– "… a good ORM will provide plenty of hooks that allow you to optimize quite a bit. You just need to spend some time learning it."
37. Is that all?
• Initially ORMs might allow you to write code quickly
– I would challenge this but that is another topic
• Your system runs into a brick wall. Customers are complaining. Your CEO is chewing out the CTO. The VP Engineering is curled up in a ball in the corner. They turn to you as the architect and you answer:
"We just need to learn how to use all the hooks"
38. Just Learn the ORM
• I have yet to meet somebody who could convince me that they knew how to scale an ORM
– It HAS been done, so yes it is possible, but it takes patience and a CEO that likes to wait
– I've had people tell me "we just have to rewrite the ORM with a new ORM that could scale"
39. Know your database
• I believe that your DB should own all your data
– Let it do what it is good at
• If that is true then simple replication strategies and a little bit of coding can get you reading data from a replica
• You can then start denormalizing the DB to get better performance
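The "little bit of coding" for reading from a replica can be sketched roughly as follows. This is an illustration under assumptions, not the Achievers implementation: `RecordingConnection` is a hypothetical stand-in for a real database connection, and the rule "statements starting with SELECT go to the replica" is deliberately simplistic.

```python
class RecordingConnection:
    """Hypothetical stand-in for a database connection; it only records SQL."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.statements = []

    def execute(self, sql: str) -> str:
        self.statements.append(sql)
        return self.name  # report which server handled the statement


class ReadWriteRouter:
    """Send SELECTs to the replica and every other statement to the primary."""

    def __init__(self, primary: RecordingConnection, replica: RecordingConnection) -> None:
        self.primary = primary
        self.replica = replica

    def execute(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        target = self.replica if is_read else self.primary
        return target.execute(sql)


if __name__ == "__main__":
    router = ReadWriteRouter(RecordingConnection("primary"), RecordingConnection("replica"))
    print(router.execute("SELECT id FROM orders"))         # handled by the replica
    print(router.execute("UPDATE orders SET shipped = 1"))  # handled by the primary
```

Replication is usually asynchronous, so a read that must see a write made a moment ago may still need to go to the primary.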
40. Scaling Your Data
• Scaling a DB is a well understood problem with well understood solutions
• Don't confuse this with easy!
42. Server Side Sessions
• Very developer friendly
• You have 2 choices to scale:
– Session replication
– Sticky Sessions
43. Session Replication
• Yuck!
• Lots of network chatter
• Slow propagation of the session means the user has a bad experience
• You could be moving lots of data around
– Our sessions were huge
44. Sticky Sessions
• Works, but you now need to worry about a machine being overloaded while the others are idle
• A machine failure logs out everybody from that machine
• You have to be very careful when configuring
– If all IP addresses from a company go to one server then you essentially have one company per server
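The "one company per server" effect is easy to reproduce. The sketch below assumes stickiness is implemented by hashing the client's source IP, which is one common approach; the talk does not say which method was used. The server names and the IP address are made up.

```python
import hashlib
from collections import Counter


def pick_server(client_ip: str, servers: list) -> str:
    """Sticky routing by source IP: the same IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def load_per_server(client_ips: list, servers: list) -> Counter:
    """Count how many requests each server receives for a stream of client IPs."""
    return Counter(pick_server(ip, servers) for ip in client_ips)


if __name__ == "__main__":
    servers = ["web1", "web2", "web3"]
    # Hypothetical: 500 employees of one customer share a single office IP address.
    office_traffic = ["203.0.113.10"] * 500
    print(load_per_server(office_traffic, servers))
```

All 500 requests land on a single server while the other two sit idle, which is the imbalance the slide warns about.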
46. When to Cache
• Our platform made extensive use of caches
• That has to be good, right?
• Not in our case
– Items were cached by Java
– Shared state posed a problem when adding another server
– Yes, there are Java based solutions but all you are doing is adding complexity
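Why an in-process cache becomes a problem with a second server can be shown in a few lines. This is a simplified sketch with hypothetical class names; it assumes each server keeps its own private cache and never hears about writes made through another server.

```python
class Database:
    """Stand-in for the shared database."""

    def __init__(self) -> None:
        self.rows = {}


class WebServer:
    """A server with a private in-process cache in front of the database."""

    def __init__(self, db: Database) -> None:
        self.db = db
        self.cache = {}  # lives inside this process, invisible to other servers

    def read(self, key: str):
        if key not in self.cache:
            self.cache[key] = self.db.rows[key]
        return self.cache[key]

    def write(self, key: str, value) -> None:
        self.db.rows[key] = value
        self.cache[key] = value  # only this server's cache is refreshed


if __name__ == "__main__":
    db = Database()
    db.rows["points"] = 100
    server_a, server_b = WebServer(db), WebServer(db)
    server_b.read("points")       # server B caches the value 100
    server_a.write("points", 50)  # the update goes through server A
    print(server_a.read("points"), server_b.read("points"))
```

Server A now reports 50 while server B keeps serving the stale 100, so the answer a user sees depends on which machine handles the request.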
48. It Won't Love You Back
• Never fall in love with your technology. It will break your heart.
• You must always challenge your assumptions and be prepared to throw away something
– Hard to throw away your "baby"
– Remember it is just a bunch of 1s and 0s
50. Basic Premise
• Every web application follows the same basic flow:
1. User makes a request
2. Validate the request
3. Grab some data
4. Process it a bit
5. Build a Page for the user
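The five steps map directly onto a request handler. The sketch below is a toy illustration: the in-memory `USERS` dictionary stands in for the database, and the parameter names and HTML are invented for the example.

```python
USERS = {"42": {"name": "Ada", "points": 1200}}  # hypothetical data store


def handle_request(params: dict) -> str:
    # 1. User makes a request: `params` holds the request parameters.
    # 2. Validate the request.
    user_id = params.get("user_id", "")
    if not user_id.isdigit():
        return "400 Bad Request"
    # 3. Grab some data.
    user = USERS.get(user_id)
    if user is None:
        return "404 Not Found"
    # 4. Process it a bit.
    summary = f"{user['name']} has {user['points']} points"
    # 5. Build a page for the user.
    return f"<html><body><p>{summary}</p></body></html>"


if __name__ == "__main__":
    print(handle_request({"user_id": "42"}))
```

Nothing in the handler survives between calls, so any server with access to the data can answer any request.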
51. Guiding Architectural Principles
• Initial deployment would be on 3 machines
– Forcing us to understand how we are going to scale up front
• Servers must be stateless
• The database owns all the data
• Caching is an explicit choice to solve a real problem
• Always use the right tool for the job
• Minimize complexity
52. Other Goals
• Zero downtime deployments
• We wanted to be able to upgrade customers one at a time
• Maximize developer productivity
53. The Target
Load Balancer
Web Server | Web Server | Web Server
MemcacheD Cluster | NAS Device | Background Processing | MySQL Master | MySQL Slave
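What makes three interchangeable web servers possible is that none of them holds session state. A minimal sketch of that idea follows. It assumes session data lives in some external store shared by all servers; the talk does not say which component held it, and `SharedStore` here is a hypothetical in-memory stand-in.

```python
class SharedStore:
    """Hypothetical stand-in for an external store shared by all web servers."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str):
        return self._data.get(key)

    def set(self, key: str, value) -> None:
        self._data[key] = value


class StatelessWebServer:
    """Keeps nothing between requests; all session data goes to the shared store."""

    def __init__(self, name: str, store: SharedStore) -> None:
        self.name = name
        self.store = store

    def login(self, session_id: str, user: str) -> None:
        self.store.set(session_id, {"user": user})

    def whoami(self, session_id: str):
        session = self.store.get(session_id)
        return None if session is None else session["user"]


if __name__ == "__main__":
    store = SharedStore()
    web1, web2 = StatelessWebServer("web1", store), StatelessWebServer("web2", store)
    web1.login("abc123", "ada")   # the login request lands on web1
    print(web2.whoami("abc123"))  # the next request lands on web2
```

Because the second server can answer for a session it never saw, the load balancer is free to send each request anywhere, and losing one web server logs nobody out.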
54. The Language Choice
• Why PHP?
– Faster code/debug cycles
• This has increased our productivity
– Zero downtime deployments
• We have patched running servers multiple times in a day and nobody has noticed anything
– Shared nothing philosophy
• Forces a good frame of mind for server development
55. Doesn't PHP Suck?
• Languages don't suck, only the developers using them do
• PHP isn't perfect
– Google "why php sucks" for an extensive list
• But PHP doesn't scale
– Remember, languages don't scale …
– If you don't believe me, ask Wikipedia, Facebook, Digg, etc.
56. Sure but PHP is Slow
• If your web application is not database bound then you are probably doing it wrong
• Yes, Java might perform better at some things, but that will not be a limiting factor
57. Surely There are Down Sides?
• Because PHP does not have strong typing you need really good error detection and reporting
– We will do another talk on our struggles and solutions
• Coding standards are a must since PHP lets you pretty much do whatever you want
– Naming conventions are super important
– Don't start a religious war over bracket placement. There really is only one right way :)
58. The Framework
• We use CodeIgniter (CI)
• Simple MVC framework
– The code is very easy to follow
• Works out of the box, but is very extensible
– Strictly follows the Open/Closed principle
– We have extended CI a lot to meet our needs
• Doesn't require learning anything but PHP
59. Using the Right Tool
• Have Apache (or a faster web server) serve all static content
• A Network Attached Storage (NAS) device was used for a shared file system.
– This makes life a TON easier
• Have your web servers serve requests
• Move background work to another server
60. The Problem
• We had about 120 customers and we couldn't just go away to do what we needed to do
– Not a bad problem to have
62. Step 1
• We wrote a controller that would forward requests to the new code base
• GET requests could be easily forwarded
• POST requests were a bit more complicated
• This step allowed us to start developing the new platform AND keep releasing features
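A forwarding controller of this kind might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the address, the migrated paths, and the `Request` type are hypothetical, and the talk does not say why POST was harder. One plausible reason, used here, is that a POST carries a body that has to be replayed along with the URL.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

NEW_CODE_BASE = "http://new-platform.internal"  # hypothetical address
MIGRATED_PATHS = {"/catalog", "/recognition"}    # hypothetical feature paths


@dataclass
class Request:
    method: str
    path: str
    params: dict


def forward(request: Request):
    """Return (method, url, body) for the new code base, or None to handle locally."""
    if request.path not in MIGRATED_PATHS:
        return None
    encoded = urlencode(request.params)
    if request.method == "GET":
        # GET is simple: the whole request fits in the URL.
        url = NEW_CODE_BASE + request.path + ("?" + encoded if encoded else "")
        return ("GET", url, None)
    # POST needs the form body replayed as well as the URL.
    return ("POST", NEW_CODE_BASE + request.path, encoded)


if __name__ == "__main__":
    print(forward(Request("GET", "/catalog", {"page": "2"})))
    print(forward(Request("POST", "/recognition", {"to": "bob"})))
    print(forward(Request("GET", "/legacy", {})))
```

Paths that have not been migrated return `None` and stay with the old code, which is what lets both code bases keep shipping features side by side.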
63. Step 2
• Start migrating customers to the new platform
• We put a proxy server in front of our old and new platforms.
• We then proxied each customer's requests to the platform version that customer was running on
64. The Setup
HAProxy
Express Platform | Achievers Platform
MySQL
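The routing decision the proxy makes can be expressed as a lookup from customer to platform. The sketch below shows that logic in plain Python rather than HAProxy configuration. The backend names are placeholders, and identifying the customer by subdomain is an assumption; the talk does not say how customers were told apart.

```python
OLD_PLATFORM = "old-platform"  # hypothetical backend names
NEW_PLATFORM = "new-platform"

# Customers already migrated; everyone else stays on the old platform.
migrated_customers = {"acme"}


def customer_from_host(host: str) -> str:
    """Assumes each customer has its own subdomain, e.g. acme.example.com."""
    return host.split(".", 1)[0].lower()


def choose_backend(host: str) -> str:
    """Pick the platform that should serve a request for the given Host header."""
    if customer_from_host(host) in migrated_customers:
        return NEW_PLATFORM
    return OLD_PLATFORM


def migrate(customer: str) -> None:
    """Moving a customer is a routing change, not a redeploy."""
    migrated_customers.add(customer.lower())


if __name__ == "__main__":
    print(choose_backend("acme.example.com"))    # already migrated
    print(choose_backend("globex.example.com"))  # still on the old platform
    migrate("globex")
    print(choose_backend("globex.example.com"))  # now routed to the new platform
```

Keeping the decision at the proxy means a customer can be moved, or moved back, without touching either platform.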
65. HAProxy
• If you don't have it installed, go back to the office, download it and install it!
• It isn't just a load balancer
– We can move specific traffic to specific machines for whatever reason
– We have a machine with profiling capabilities that we have used to profile production problems
– Fine-grained control over your requests
66. We did it!
• It took us almost 6 months to migrate every customer but we did get there
• Our productivity has improved
• And we have an architecture that we know can handle whatever we can throw at it
– At least in the short term
68. Scaling is Hard
• Don't make it harder on yourself
– Reduce complexity
– Understand your database
– Have an upfront strategy to deal with state
• We picked stateless but you don't have to
69. Never let anybody tell you a language or framework does or doesn't scale.
It is all in the details