These are my summarized notes from all the microservices sessions I attended at QCon 2015. The sessions offered a lot of lessons on how to scale microservices and avoid common pitfalls.
3. KeyNote: Micro Chips to Microservices
• 1945 - 2015: calculations per second have increased by a factor of 1 trillion
• Hardware mainly scaled by miniaturization & abstraction like plug n play.
• Software doesn’t really scale by abstraction: what we write today hasn’t
moved much beyond high-level languages, e.g. Fortran and HTML are in the same
generation.
• Software scales by federation and wide participation
• Wide participation means lots of different people bring their knowledge, discover different
things and contribute to a shared knowledge base.
• Federated architecture means you can create your own website or app and do whatever
you want individually without affecting other parts of the internet or app
ecosystem, and then share what you learned with the rest of the world.
• So if you are thinking of scaling something, think about how you can federate and then
share the learnings from each federation.
4. KeyNote: Micro Chips to Microservices
One way to build this kind of federated architecture is microservices.
Here are a few points about them:
1. Has to be independently Deployable (fundamental value)
2. End-to-end responsibility of one single small team
3. No Central Database
4. Extensive Automation & Monitoring
5. Smart Versioning Services
5. KeyNote: Micro Chips to Microservices
Checklist before you embark on the microservices track
1. Is it right for the domain? (usually very high volumes)
2. Do you understand the domain boundaries? (refactoring
across services is very hard)
3. Can you maintain strict discipline? (interaction restricted to
hardened interfaces)
4. Can you have high situational awareness about your
systems? i.e. before deployment, the consumer knows that the
producer’s interfaces are unchanged, rather than relying only on
mocks
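One way to get that kind of awareness is a consumer-side contract check that runs against the producer's real output before deployment. This is only a minimal sketch: the field names (`user_id`, `email`) and the payload are hypothetical, not something shown in the keynote.

```python
# Hypothetical contract: the fields (and types) this consumer relies on.
CONSUMER_CONTRACT = {"user_id": int, "email": str}


def contract_violations(contract, response):
    """Return a list of human-readable mismatches between contract and response."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(response[field]).__name__}"
            )
    return problems


# In a real pipeline this payload would come from the producer's actual build,
# not from a hand-written mock. Extra fields (like "plan") are allowed.
producer_response = {"user_id": 42, "email": "a@example.com", "plan": "basic"}
violations = contract_violations(CONSUMER_CONTRACT, producer_response)
print(violations or "contract satisfied")
```

If the list of violations is non-empty, the deployment would be stopped before the interface change reaches production.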
6. KeyNote: Micro Chips to Microservices
Strategies for Monolith
• Start packing related code into a container
• For a complex system, large releases can cause a lot of
instability, so you should do small continuous
deployments
• Deployment is different from release: switching
feature flags on and off is what counts as releasing
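The deploy-versus-release distinction can be sketched with a tiny feature flag store. This is an illustrative example only: the `FeatureFlags` class and the `new_recommendations` flag are hypothetical names, not something from the talk.

```python
class FeatureFlags:
    """In-memory flag store; a real system would read flags from config or a service."""

    def __init__(self):
        self._flags = {}

    def set(self, name, enabled):
        self._flags[name] = enabled

    def is_enabled(self, name):
        # Unknown flags default to off, so deployed code stays dark until released.
        return self._flags.get(name, False)


def render_homepage(flags):
    # Both code paths are deployed; the flag decides which one users see.
    if flags.is_enabled("new_recommendations"):
        return "homepage with new recommendations"
    return "classic homepage"


flags = FeatureFlags()
print(render_homepage(flags))            # deployed but not released
flags.set("new_recommendations", True)   # the "release" is just a flag flip
print(render_homepage(flags))
```

Rolling back is the same flag flip in reverse, with no new deployment needed.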
8. Netflix’s Viewing Data Microservices
• Netflix Scale
• 62 Million members
• 50+ countries
• 3 Billion Hours/Month
• 1000+ device types
• 37% of downstream bandwidth in North America
9. Netflix’s Viewing Data Microservices
• Viewing Data Services need to calculate
• Who, What, When, Where (physical and device) and how long they
watched
• How does this benefit Users?
• Pickup where you left off (switch devices)
• MyActivity timeline
• Helps Netflix monitor playback quality, e.g. buffering speed.
• Helps them with subscription and purchasing decisions, e.g. they figured out that Adam
Sandler does well in all regions, so they made a deal with him
10. Netflix’s Viewing Data Microservices
High-volume events such as heartbeat events and read-last events both go
to the stateful tier
System Architecture before Microservices
13. Netflix’s Viewing Data Microservices
• Why change? (and assume all this risk and cost)
• The current system would have worked well for the next few years;
monoliths are not a bad thing and are often actually great
• They wanted to make the change before the need became imminent, so they could do it
mindfully
• They wanted to work in a mode where nothing is ever left in maintenance
mode
• Rapid growth due to a virtuous cycle: viewing - improved
personalization - better experience - more viewing
• The stateful instance count remained the same 24/7 regardless of load
14. Netflix’s Viewing Data Microservices
• How did they do that?
• Shadow Testing
• All requests go through both systems, legacy and microservices,
but only the legacy system serves the users.
• This verifies not only that the new system operates correctly but also that
it works properly at scale
• Traffic Dial
• To do this they needed a consistent view of the
world, so they sacrificed a bit of microservice purity: they removed
persistence from the new services and pointed them at the legacy system
• 1% of the traffic hit some of the new services directly, then they dialed up
from there to 100%
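The two migration techniques above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Netflix's implementation: `legacy_service`, `new_service` and the CRC-based bucketing are all hypothetical stand-ins.

```python
import zlib


def legacy_service(request):
    return {"source": "legacy", "echo": request}


def new_service(request):
    return {"source": "new", "echo": request}


def shadow_call(request, mismatches):
    """Send the request to both systems; only the legacy answer is returned."""
    legacy = legacy_service(request)
    try:
        candidate = new_service(request)
        if candidate["echo"] != legacy["echo"]:
            mismatches.append(request)
    except Exception:
        # A failure in the shadow path must never affect the user.
        mismatches.append(request)
    return legacy


def dialed_call(request, user_id, dial_percent):
    """Route a stable dial_percent share of users to the new service."""
    bucket = zlib.crc32(user_id.encode("utf-8")) % 100  # stable bucket 0..99
    if bucket < dial_percent:
        return new_service(request)
    return legacy_service(request)


mismatches = []
print(shadow_call("play:title-1", mismatches)["source"])
print(dialed_call("play:title-1", "user-123", 0)["source"])
print(dialed_call("play:title-1", "user-123", 100)["source"])
```

Because the bucket is derived from the user id, a user who is moved to the new service at one dial setting stays there as the dial goes up.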
15. Netflix’s Viewing Data Microservices
Key Points
• Devour the whale a bite at a time
• Design for idempotency so events can be replayed (using
something like CQRS/Event Sourcing)
• System architectures are throw-away artifacts,
useful for only a limited time. Design for 10x, plan
to rewrite before 100x.
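To make the idempotency point concrete, here is a small sketch of an event handler that can safely see the same event twice. The `ViewingStore` class and the event fields are hypothetical, chosen only to echo the viewing-data example.

```python
class ViewingStore:
    """Applies viewing events at most once each, so a replay is harmless."""

    def __init__(self):
        self.seconds_watched = {}   # (member, title) -> total seconds
        self._seen_event_ids = set()

    def apply(self, event):
        # Deduplicate on a unique event id; a replayed event is a no-op.
        if event["event_id"] in self._seen_event_ids:
            return False
        self._seen_event_ids.add(event["event_id"])
        key = (event["member"], event["title"])
        self.seconds_watched[key] = self.seconds_watched.get(key, 0) + event["seconds"]
        return True


events = [
    {"event_id": "e1", "member": "m1", "title": "t1", "seconds": 30},
    {"event_id": "e2", "member": "m1", "title": "t1", "seconds": 45},
]

store = ViewingStore()
for event in events + events:   # replay the whole log a second time
    store.apply(event)
print(store.seconds_watched)
```

Replaying the log leaves the totals unchanged, which is what makes it safe to re-run events through a new system during a migration.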
17. Engineering for Scale at VMTurbo
• Who is VMTurbo?
• VMTurbo is a data center control system that does automatic
resource allocation
• They do that by creating a marketplace with workloads
(applications, VMs, containers) as buyers and CPU, storage, fabric
as sellers
• Why not architect with microservices from the beginning?
• A monolith allows you to explore the complexity of a system and its
component boundaries
• Martin Fowler argued in a recent article that you shouldn’t start with
microservices; starting with a monolith helps you understand your domain
19. Engineering for Scale at VMTurbo
• Problems
• Release cycle of 6 months with interim patches
• No metrics captured
• Monolithic team because of monolithic architecture
• The monolithic architecture caused scalability and concurrency problems and tangled interfaces
between components.
• Catalysts for Change
• Growth in customer base
• More Large Environments
• Geographical Spread of Team
• More Frequent Deliveries
20. Engineering for Scale at VMTurbo
• Steps to create first service
• Clean up the interface between Mediation (the service to
be created) and the Analysis component
• Separate the Mediation component completely from the
monolith
• Publish APIs so anyone can write a mediation
component that works with VMTurbo Analysis
21. Microservices and the art of taming the dependency hell
Michael Bryzek Cofounder & ex-CTO Gilt
@mbryzek
mbryzek@alum.mit.edu
22. Microservices and the art of taming the dependency hell
• Gilt is an organization with 1500 git repos and
over 150 microservices. It has about 1000 people, with
150+ in tech
• Gilt started with a very basic Rails monolithic
architecture
• They started with excessive caching but soon
realized that it was very hard to reason about how adding
any new feature would behave in real time.
• They then minimized caching and relied instead on
very fast data stores
23. Microservices and the art of taming the dependency hell
• They started extracting the parts of the application that had to be highly available
and put them behind services. A few of the services they extracted:
• User Service with 10K RPS and millions of users. They went for
MongoDB to get absolute consistency instead of eventual
consistency.
• Catalog Service with 5K RPS, using a relational DB
• Inventory Service with 10K+ RPS and a guarantee to never oversell,
using HBase.
• Cart Service with low throughput, using DynamoDB
• Any significant feature they add now starts out as a service.
24. Microservices and the art of taming the dependency hell
• What problems appeared once they started adding services?
• Builds get larger and slower: you keep depending on more services and
adding their client libraries until you have downloaded
the world
• New client libraries get created that are each just a little bit different
• Custom APIs get produced instead of consistent ones, which reduces
interoperability
• The amount of boilerplate code increases
• Code quality drops and development slows down
• And eventually you will see a production error
25. Microservices and the art of taming the dependency hell
• To minimize all those pains they followed a guiding principle
called The Open Source Way.
• This means building proprietary software
the way open source software is built, in the following
areas:
• How do they provide documentation?
• How does the library integrate with other apps?
• How do I get support, contribute or report bugs?
• Public or private is a detail.
26. Microservices and the art of taming the dependency hell
Some specific strategies to avoid these problems and manage dependencies:
1. Tooling Matters
• Everyone who has succeeded with microservices has used extensive
tooling to automate things.
2. API Design must be first class
• The design of your API and its data structures is the hardest thing to change
afterwards.
• You can use tools like Protobuf, Thrift, Avro, Swagger 2.0 and apidoc to make
the schema first class, which makes it very easy for a consumer to know
what data it is getting.
• Schema-first design is the most important concept for avoiding dependency
hell.
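As a rough illustration of schema-first design, the sketch below keeps one schema definition and derives both validation and consumer-facing documentation from it. The schema format, `validate` and `describe` are hand-rolled and hypothetical; they do not mirror the format of any of the tools named above.

```python
# Hypothetical, hand-rolled schema; real projects would use one of the tools above.
USER_SCHEMA = {
    "name": "user",
    "fields": [
        {"name": "id", "type": "string", "required": True},
        {"name": "email", "type": "string", "required": True},
        {"name": "age", "type": "integer", "required": False},
    ],
}

PYTHON_TYPES = {"string": str, "integer": int}


def validate(schema, payload):
    """Return a list of errors; an empty list means the payload matches the schema."""
    errors = []
    for field in schema["fields"]:
        name = field["name"]
        if name not in payload:
            if field["required"]:
                errors.append(f"{name}: required field is missing")
            continue
        if not isinstance(payload[name], PYTHON_TYPES[field["type"]]):
            errors.append(f"{name}: expected {field['type']}")
    return errors


def describe(schema):
    """Render the same schema as plain-text documentation for consumers."""
    lines = [f"model {schema['name']}"]
    for field in schema["fields"]:
        marker = "required" if field["required"] else "optional"
        lines.append(f"  {field['name']}: {field['type']} ({marker})")
    return "\n".join(lines)


print(describe(USER_SCHEMA))
print(validate(USER_SCHEMA, {"id": "u1", "age": "ten"}))
```

Since documentation and validation come from the same definition, they cannot drift apart, which is the main appeal of treating the schema as the source of truth.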
27. Microservices and the art of taming the dependency hell
3. Accurate Documentation
• The amount of documentation should be similar to what an open
source project needs to be successful.
• Use semantic versioning to signal any breaking changes.
• Accurate documentation can be achieved by producing the documentation as part of the
software process
4. Generating Client Side Libraries
• Makes it easy to test
• Reduces a lot of boilerplate
• Consistent naming
• You can minimize the external dependencies
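For the semantic versioning point under item 3, a consumer or build tool can read the breaking-change signal straight from the version number. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` strings and ignoring pre-release tags and the special rules for 0.x versions; the function names are hypothetical.

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release support)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_breaking_upgrade(current, candidate):
    """Under semantic versioning, a major-version bump signals a breaking change."""
    return parse_version(candidate)[0] > parse_version(current)[0]


for current, candidate in [("1.4.2", "1.5.0"), ("1.4.2", "2.0.0")]:
    label = "breaking" if is_breaking_upgrade(current, candidate) else "compatible"
    print(f"{current} -> {candidate}: {label}")
```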
28. Microservices and the art of taming the dependency hell
5. Backward Compatibility
• Renaming just doesn’t work.
• Introduce a new model, migrate, and then deprecate the old one
6. Forward Compatibility
• Your service shouldn’t blow up if a new field is added or seen by
the system.
• Be careful with enums: think about what happens when you add a value in the
future.
• Don’t throw an exception if a new field shows up
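The forward-compatibility rules can be sketched as a tolerant reader: take only the fields you know, and map unrecognized enum values to a fallback. The `OrderStatus` enum, the field names and the payload are hypothetical examples.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    UNKNOWN = "unknown"   # fallback for values added by newer producers


KNOWN_FIELDS = ("order_id", "status")


def parse_order(payload):
    """Read only the fields we know; ignore extras and map new enum values to UNKNOWN."""
    order = {name: payload[name] for name in KNOWN_FIELDS}
    try:
        order["status"] = OrderStatus(order["status"])
    except ValueError:
        order["status"] = OrderStatus.UNKNOWN
    return order


# A newer producer added a field and an enum value this consumer has never seen.
newer_payload = {"order_id": "o-1", "status": "returned", "gift_wrap": True}
print(parse_order(newer_payload))
```

The consumer keeps working against the newer producer and can decide separately how to treat `UNKNOWN`, rather than failing at parse time.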
29. Scaling Stack Overflow: Keeping it Vertical by Obsessing Over Performance (case for monoliths)
David Fullerton
@df07
30. Scaling Stack Overflow: Keeping it Vertical by Obsessing Over Performance
• 4 Billion Requests/Month, 3K requests/s, 45 M uniques/month, 8K qs/month
and 500,000 page-views/month
• 34 Devs, 6 sysadmins, 6 designers, 75% remote
• Their Architecture
• 2 HAProxy - one failover
• 9 web servers
• 4 SQL Servers (Vertical scaling - 2 clusters)
• 2 Redis Servers
• 3 Elasticsearch Servers
• 3 Tag Engine Servers
32. Scaling Stack Overflow: Keeping it Vertical by Obsessing Over Performance
• It’s what they call Monolith Plus: most of the work happens in the web
tier and the DB.
• It scales really well for them.
• All of their web servers and SQL servers run under 10% CPU
consumption, and most of the RAM consumption is under 70%
• They deploy all day, every day, and a deploy through the web tier takes 3 minutes. This
gives them a huge ability to test in production since they can roll out so
fast
• They test on users, with a few unit and integration tests but not a lot of
automated tests.
• Big believer in feature flags.
33. Scaling Stack Overflow: Keeping it Vertical by Obsessing Over Performance
• This works specifically for Stack Overflow because of
• A read-heavy load centered on one page.
• A forgiving community: they released the bug with an alert on the homepage
• How do they work?
• Start with what they know
• Measure it live
• Fix the slow - because performance is a feature
• Use extensive caching
• Optimize for performance instead of scaling out
34. Scaling Stack Overflow: Keeping it Vertical by Obsessing Over Performance
“my primary guideline would be don’t even consider
microservices unless you have a system that’s too
complex to manage as a monolith” - Martin Fowler
35. The Seven Deadly Sins of Microservices
Daniel Bryant
@danielbryantuk
36. Seven Deadly Sins of Microservices
1. Lust - using the latest and greatest
• Choose Boring Technology
• Use Matt Raible’s comparison framework to add objectivity
2. Gluttony - Excessive communication protocols
• Initially choose only one sync (e.g. JSON over HTTP) and one async (e.g. RabbitMQ) protocol
3. Greed - All your Service are belong to us
• Don’t underestimate the effect microservices will have on your organization, not only on the technical
side
• A few good books - The Connected Company, The Modern Firm, On Change Management
4. Sloth - Creating a distributed monolith
• If you can’t deploy services independently then you are not doing microservices
37. Seven Deadly Sins of Microservices
5. Wrath - Blowing up when bad things happen
• Putting Chaos Monkey in your deployment pipeline is really useful
• Read up on Release It!
6. Envy - The shared single domain fallacy
• One shared model breaks encapsulation and introduces coupling
• Know DDD
7. Pride - Testing the world of transience
• There is a mindset change in how you test
• Invest in your Build Pipeline
• Use Serenity BDD
• WireMock - testing fault tolerance in Jenkins
• Testing in production? - Netflix and Gilt - once you reach a certain number of services, the only way to test is in
production
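The idea behind fault-tolerance testing in a build pipeline (the Wrath and Pride points above) can be sketched with a stub that injects failures and a caller that degrades gracefully. This is plain Python for illustration only; it does not use or imitate the WireMock API, and `FlakyDependency` and `get_recommendations` are hypothetical names.

```python
class FlakyDependency:
    """Test stub that fails the first `failures` calls, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("injected fault")
        return "fresh recommendations"


def get_recommendations(dependency, retries=2, fallback="popular titles"):
    """Try the dependency a bounded number of times, then degrade gracefully."""
    for _ in range(retries + 1):
        try:
            return dependency.fetch()
        except ConnectionError:
            continue
    return fallback


print(get_recommendations(FlakyDependency(failures=1)))   # recovers after one retry
print(get_recommendations(FlakyDependency(failures=10)))  # falls back instead of blowing up
```

Running checks like these on every build makes the failure path a tested code path rather than something first exercised in production.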
38. Summary
• Know your domains very well before you start creating microservices, because refactoring
across services is very hard
• You are not doing microservices if you are not deploying independently
• Make API design first class by using tools like Swagger and apidoc
• Architect for failure and build failure testing into your build pipeline by using tools like
WireMock
• Testing is really hard because it’s impossible to have a full view of the system all the time, so
you have to invest in tools, API documentation and your build pipeline to better gauge your system.
• Use generated clients to avoid tons of boilerplate
• Don’t borrow other people’s problems; figure out your own pain points.
• Microservices carry a hefty tax, and it is usually worth it if the team size, complexity or scale
of the app is growing
• Monoliths may be the best architecture in certain domains, e.g. Stack Overflow