This document covers building high-performance HTTP front-ends: why HTTP dominates, what performance a single server should deliver (500k concurrent users, 100k requests/second), why availability should come from routing protocols and redundancy rather than stateful failover, and how protocols such as SPDY may reduce latency by multiplexing requests.
Building High-Traffic HTTP Front-Ends. Theo Schlossnagle. Hall 1
1. High-performance, Robust HTTP Front-ends
/ tips, tricks and expectations
Saturday, April 23, 2011
2. Who am I? @postwait on twitter
Author of “Scalable Internet Architectures”
Pearson, ISBN: 067232699X
Contributor to “Web Operations”
O’Reilly, ISBN:
Founder of OmniTI, Message Systems, Fontdeck, & Circonus
I like to tackle problems that are “always on” and “always growing.”
I am an Engineer
A practitioner of academic computing.
IEEE member and Senior ACM member.
On the Editorial Board of ACM’s Queue magazine.
3. Agenda
• Why only HTTP?
• HTTP-like protocols
• Performance
• Availability
4. HTTP
• Why only HTTP... it’s what we do.
• User-based, immediate, short-lived
transactions occupy my life.
• So, not just HTTP.
• HTTPS
• SPDY (... we’ll get to this)
5. Performance
• ATS (Apache Traffic Server)
• supports SSL
• battle-hardened codebase
• very multi-core capable
• Varnish
• VCL adds unparalleled flexibility
• no SSL!
• nginx
• I don’t see much of this out on the edge
6. Performance Expectations
• from a single server, you should be able to:
• support 500k concurrent users
• this is only 40k sockets/core
• push in excess of 100k requests/second
• this is only 9k requests per core per second
• push close to 10 gigabits
• this is why 10G was invented
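The per-core figures above are simple division; a quick sanity check, assuming a 12-core box (dual-socket, six cores per socket, typical of Westmere-era hardware):

```python
# Back-of-envelope check of the per-core expectations.
# CORES = 12 is an assumption: dual-socket, 6 cores per socket.
CORES = 12

print(500_000 // CORES)   # sockets per core for 500k concurrent users
print(100_000 // CORES)   # requests per core per second for 100k req/s
```

These land near the slide's round figures of 40k sockets/core and 9k requests per core per second.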
7. Performance Achievements
• Good load balancers achieve this performance
• with dual-socket Westmere processors, we can achieve in software, on general-purpose hardware, what was previously possible only in hardware ASICs.
• ATS and Varnish can do this today.
8. The Basic Rules: Content
• You must serve content from cache
• Your cache should fit in memory
• If it does not, it should spill to SSD,
not spinning media.
9. The Basic Rules: CPU
• You must cache SSL sessions
• SSL key negotiation is expensive.
• SSL encryption is not*
• Common cases must not cause state on the firewall.
• It’s hard enough to serve 150k requests/second.
• You will spend too much time in the kernel in iptables, ipf, or pf.
• allow port 80 and port 443.
• enable SYN flood prevention
* crypto obviously costs CPU; symmetric crypto is relatively cheap
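The firewall rules above can be sketched with iptables on Linux (a minimal sketch, not a hardened ruleset; ipf and pf have equivalents):

```
# Stateless accept rules: match by port only, no connection tracking.
iptables -A INPUT -p tcp --dport 80  -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT

# SYN flood prevention via SYN cookies (Linux sysctl).
sysctl -w net.ipv4.tcp_syncookies=1
```

Avoid `-m state`/`-m conntrack` matches entirely; merely loading them makes the kernel track every flow, which is exactly the cost these rules exist to avoid.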
10. The Basic Rules: Network
• You must not run a stateful firewall in front
• too expensive
• too little value
• You must be directly behind capable router(s)
• expect anywhere from 1MM to 20MM packets per second
• we need to run BGP for availability
11. Availability
• We learned in the performance section:
• 1 machine / 10Gbps uplink performs well enough
• We need redundancy:
• Linux HA?
• VRRP/HSRP?
• CARP?
• No...
12. Availability: Constraints
• Client TCP sessions are relatively short lived.
• The web is a largely idempotent place.
• Clients are capable of retrying on failure.
• This means:
• forget stateful failover.
• focus on availability for new connections.
13. Availability: Setup
• You are behind a capable router (it was a rule)
• Use routing protocols (BGP) to maintain availability.
[Diagram: two front-end servers behind a router speaking BGP; each announces its own more-specific prefix (10.1.0.0/24 and 10.1.1.0/24) plus the shared covering prefix 10.1.0.0/23.]
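The setup above can be sketched as a Quagga bgpd configuration for one of the two servers; the AS numbers and neighbor address are placeholders (a minimal sketch, not a production config):

```
! bgpd.conf -- server announcing its own /24 plus the covering /23
router bgp 64512
 neighbor 10.1.255.1 remote-as 64511
 network 10.1.0.0/24
 network 10.1.0.0/23
```

If this server dies, its BGP session drops and its /24 is withdrawn; the peer's announcement of the covering 10.1.0.0/23 keeps new connections flowing to the surviving box.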
14. Working Stacks
• Stack A: Linux (OS/TCP stack) + Varnish (HTTP) + Quagga (BGP)
• Stack B: Illumos (OS/TCP stack) + ATS (HTTP/HTTPS) + Quagga (BGP)
15. Future!
• This stuff is fast.
• In the end, we’re not looking for faster servers,
we’re looking for improved user experience.
• Enter SPDY
• Google’s multi-channel HTTP super-protocol
• Allows multiplexing of concurrent HTTP(like)
request/response on a single TCP session.
• Defeats slow startup
• Allows for content prioritization on server
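The multiplexing idea above can be illustrated in miniature: tag each request/response with a stream ID so several exchanges can interleave on a single connection. This is a toy length-prefixed framing, not the actual SPDY wire format:

```python
import struct

def encode_frame(stream_id: int, payload: bytes) -> bytes:
    # Toy frame: 4-byte stream id + 4-byte length + payload.
    # (Illustrative only -- real SPDY frames carry more header fields.)
    return struct.pack(">II", stream_id, len(payload)) + payload

def decode_frames(buf: bytes):
    # Yield (stream_id, payload) pairs from a byte stream in which
    # frames belonging to different streams may be interleaved.
    off = 0
    while off < len(buf):
        stream_id, length = struct.unpack_from(">II", buf, off)
        off += 8
        yield stream_id, buf[off:off + length]
        off += length

# Two "requests" share one TCP session; their frames interleave freely.
wire = encode_frame(1, b"GET /a") + encode_frame(3, b"GET /b")
print(list(decode_frames(wire)))
# [(1, b'GET /a'), (3, b'GET /b')]
```

Because each frame names its stream, the server can answer in any order and prioritize content, which is the property the slide calls out.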
16. Future: my thoughts
• SPDY is relatively simple to implement on the server
• SPDY is very very hard to leverage on the server
• If ATS implemented SPDY in and out
• and provided a robust configuration language to leverage it
... the future would be today.
17. Thank you.
• Thank you Олег Бунин
• Thanks to the Varnish and ATS developers.
• Спасибо (thank you).