This document summarizes a presentation about architecting scalable web applications. It discusses key concepts like scalability versus performance, load versus stress testing, common patterns for scaling such as partitioning applications and databases, balancing computing loads, caching responsibly, and planning for concurrency. It also lists some anti-patterns to avoid and provides resources for further reading on scaling applications.
Anti-patterns:
- Blaming another department
- "It works great on my machine!"
- Attempting to maintain state
- Spending all your time looking at the code
- Caching everything (twice!)
- Services calling services, especially across the network or networks
Hopefully you are here to learn how to avoid scalability problems, but there are doubtless people in this room who have experienced scalability problems or who are currently experiencing them. Hopefully that means that you have an application that has exceeded your expectations and you are reaping the rewards of your hard work!
One of the first things that we would like to address is the common confusion between the very related topics of performance and scalability. Most of the time you will actually hear the two terms used like that: with the "and" right in the middle of them. The truth of the matter is that the two are related, but sometimes they actually work against each other. Things that you do for scalability can actually hurt performance. Adding a load balancer (as fast as those devices are) adds a finite amount of time to each and every request, which hurts performance. For the purposes of our discussion today we will use the following definition (put forth by Richard Campbell of the .NET Rocks show):
When we talk about scalability, linear scalability often comes to mind. Linear scalability relative to load means that with fixed resources, performance decreases at a constant rate as load increases. Linear scalability relative to resources means that with a constant load, performance improves at a constant rate as resources are added. (click slide) The reality with scaling web applications is not a straight line; often you will see a step function emerge. There is a base to this type of step function: processing zero requests in your web application still requires some infrastructure. You can reasonably process a certain amount of load before you need to "step up" to the next level. The problem with step functions is that past a certain point, each additional "step" costs you more to take. (click slide) Our ideal scenario is a relatively flat line: as the number of transactions increases, the cost to process each additional transaction does not increase. This way of measuring scalability was introduced by Roger Sessions of ObjectWatch about ten years ago, and it holds true today. Also, as we plan our architecture, we want to keep that cost per transaction as low as possible. (click slide)
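To make the step function concrete, here is a small Python sketch of cost per transaction under a "whole servers at a time" cost model. The capacity and cost numbers are invented for illustration; they are not from the talk.

```python
# Hypothetical illustration of the step function described above:
# infrastructure is bought in whole-server "steps", so total cost jumps
# each time load crosses a server's capacity. Numbers are made up.
def step_cost(transactions, capacity_per_server=10_000, cost_per_server=5_000):
    """Total infrastructure cost when capacity is added one server at a time."""
    servers = max(1, -(-transactions // capacity_per_server))  # ceiling division
    return servers * cost_per_server

# Cost per transaction falls within a step, then jumps at each boundary.
for tx in (1, 9_000, 10_001, 50_000):
    print(tx, step_cost(tx) / tx)
```

The ideal flat line from the slide would be a constant `step_cost(tx) / tx` no matter how large `tx` gets; the sketch shows why stepped infrastructure costs make that hard.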
Another point to make is that load and stress are not the same thing to a system. Load refers to the number of concurrent users currently using an application, and there is a finite amount of load any system can handle due to computing and network resources. Stress refers to how the system behaves when its computing resources become constrained.
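The distinction shows up in how you test: a load test holds a fixed number of concurrent users, while a stress test keeps ramping users up until the system degrades. A minimal sketch in Python, where `do_request` is a hypothetical stand-in for one web request (not from the talk):

```python
# Minimal load-test sketch: N concurrent workers each issue a fixed number
# of requests. A stress test would call run_load with ever-larger
# concurrent_users values until response times or error rates degrade.
from concurrent.futures import ThreadPoolExecutor
import time

def do_request():
    """Hypothetical stand-in for one HTTP request to the system under test."""
    time.sleep(0.01)  # simulate network + server time
    return True

def run_load(concurrent_users, requests_per_user):
    """Run the workload with a fixed level of concurrency; return successes."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        results = pool.map(lambda _: do_request(),
                           range(concurrent_users * requests_per_user))
        return sum(results)
```

Real tools (and the load-testing products mentioned later in these notes) do the same thing with realistic request mixes and response-time measurement.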
Let's take a look at the architecture of a typical web application. Now don't laugh (although it is very tempting)! A slide similar to this was put up by Charlie Bell of Amazon.com, and he said that this was the architecture they started with — and see where it has gotten them! Probably a large majority of the sites on the Internet run with this architecture and probably don't need more than that. But there are obviously some scalability issues with this model. Can you name some of the things that could limit scale?
- Network
- Code
- Database
- Web server
Just about everything, eh?
We are going to use a series of patterns to talk about how you could scale. These are not mutually exclusive, and you will probably use more than one in your journey. But the first one is often overlooked, and that is an examination of your current environment to make sure you are not missing something obvious. I am amazed at the interesting things people find when they examine their environment. Often these examinations increase performance, increase scalability, and avoid thousands of dollars in hardware costs. Here are two examples that I have personally experienced:
- Debug="True" in your web.config files. I would bet that there are several people in this room who have production ASP.NET applications running in Debug mode. Why? Because that is the default option! Among other things, the debug setting removes all timeouts from your application, so if you have an error the request will live forever. Talk about the "retail" configuration in ASP.NET 2.0 and higher, and suggest that people call their server admins during the break to get that setting changed!
- Internal network traffic running on old switching equipment. I personally worked on a project where we could not figure out why our production traffic was much slower than our test environment (where we did load testing). It turns out that the web server and the database server were connected by a 10 Mbps switch misconfigured to run at half duplex. The cost of a 100 Mbps managed switch was about $100 at the time; we had spent countless thousands of dollars in man-hours trying to improve the performance.
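For reference, the debug fix is a one-line config change, and ASP.NET 2.0's "retail" setting lets server admins override debug mode machine-wide:

```xml
<!-- web.config: make sure debug compilation is off in production -->
<system.web>
  <compilation debug="false" />
</system.web>

<!-- machine.config (ASP.NET 2.0 and higher): forces retail behavior
     server-wide, overriding any debug="true" in individual web.config files -->
<system.web>
  <deployment retail="true" />
</system.web>
```

The `retail` switch is exactly the setting the notes suggest asking your server admins about: with it set, no application on the box can accidentally run in debug mode.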
This pattern often manifests itself as "throw hardware at it". What this pattern really entails is leaving your architecture the way it is, but increasing the capacity of the hardware it runs on: faster processors, more memory, faster disks. This is scaling up rather than scaling out.
When we showed the first architecture slide, I think a lot of you cringed and thought "single point of failure" more than "scalability concern". Most corporate web sites follow this pattern not out of scalability concerns, but because they are very risk averse and want their web sites to be "hyper available". Web servers are relatively cheap in relation to the cost of business done on them in an hour (a beefy web server, fully loaded and licensed, will run you about $5,000, and most eCommerce web sites do that amount of volume in an hour, so the cost of a redundant server is much less than the business you would lose). Already in taking this first step we have a lot of architectural decisions to make. From an infrastructure standpoint we have:
- Are we going to use "software" load balancing or a dedicated hardware device?
- How do I determine that a web server has malfunctioned and should be taken out of the rotation?
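The "has a server malfunctioned?" question is usually answered with a health probe. A minimal Python sketch, assuming each web server exposes a lightweight `/health` URL (an assumption for illustration, not something from the talk):

```python
# Hypothetical health-probe sketch: a load balancer periodically asks each
# server whether it is alive, and pulls non-responders out of rotation.
import urllib.request

def is_healthy(url, timeout=2.0):
    """Return True only if the server answers the probe with HTTP 200 in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, DNS failure, timeout, etc.
        return False

def servers_in_rotation(servers):
    """Keep only the servers that currently pass the health check."""
    return [s for s in servers if is_healthy(s + "/health")]
```

Hardware and software load balancers differ mainly in where this logic runs and how sophisticated the probe is (TCP connect, HTTP status, or a full synthetic transaction).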
- Load testing.
- Decide where the logic and processing live; in other words, where is the logic? Client or server? This line is getting more and more blurry, but there are some guideposts that we can throw out there.
- Decide where the data goes. What is the balance between client side and server side? Think S+S, multiple clients, and state management. Speed versus flexibility.
Photo credit: http://www.flickr.com/photos/saschapohflepp/
- Understand locality of reference.
- Where is caching already happening in your system right now? UI, BL, DB.
- What caching can you get for free?
- Caching is a business decision!
Photo credit: http://www.flickr.com/photos/peasap
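As an example of caching you can "get for free", Python ships an in-process cache in the standard library. The `product_details` function here is a made-up stand-in for an expensive database call:

```python
# "Free" caching sketch: functools.lru_cache memoizes results in-process,
# exploiting locality of reference (recently requested items are requested
# again). product_details is a hypothetical stand-in for a database query.
from functools import lru_cache

@lru_cache(maxsize=1024)
def product_details(product_id):
    # Imagine an expensive database round-trip here.
    return {"id": product_id, "name": "product-%d" % product_id}

product_details(7)               # miss: goes to the "database"
product_details(7)               # hit: served from the cache
print(product_details.cache_info())
```

The business-decision part is everything the decorator does not answer: how stale a cached product may be, which items must never be cached, and what happens to the cache across multiple web servers.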
- Just as two objects can't occupy the same space, two people can't edit the same data without a conflict.
- Build for multi-user concurrency right up front.
- Optimistic concurrency is NOT concurrency!
Photo credit: http://www.flickr.com/photos/conanil/
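For context on the last bullet, here is a minimal Python sketch of the optimistic-concurrency pattern the slide is cautioning about: every row carries a version, and a write is rejected if the version changed since it was read. It detects conflicts after the fact rather than preventing them, which is the slide's point. The class and field names are invented for illustration:

```python
# Optimistic concurrency sketch: each record carries a version number.
# A writer must present the version it originally read; if another user
# saved in the meantime, the versions no longer match and the write fails.
class ConcurrencyError(Exception):
    pass

class Record:
    def __init__(self, data):
        self.data = data
        self.version = 0  # bumped on every successful save

def save(record, new_data, expected_version):
    """Reject the write if someone else saved since we read the record."""
    if record.version != expected_version:
        raise ConcurrencyError("record was modified by another user")
    record.data = new_data
    record.version += 1
```

Nothing here stops two users from editing simultaneously; it only makes the second save fail, and the application still has to decide what to tell that user — which is why it needs to be designed in "right up front".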
We would like to wrap up this session with a discussion of the "scale later" philosophy. This ArcReady topic actually stemmed from the Web 2.0 session that we did last fall. There is a company called 37signals (based out of Chicago, IL) that has a book called "Getting Real". One of the tenets of their philosophy (not sure if we can call it a methodology) is that when you are creating a web application for public consumption, there are 1,001 other things to consider before you have to worry about scalability. Their notion is that you should worry about the application's features and functionality much more than about how your application is going to serve 1,000,000 people. Their thought is that you should worry about getting your first customer, and deal with the problem of 1,000,000 users if it comes to fruition.
Story about a.com – 20 million users, 20,000 concurrent users