Example evaluation sheet

                 ATS   HAproxy   nginx   Squid   Varnish   mod_proxy
Worker Threads    Y       N        N       N        Y          Y
Multi-Process     N       Y        Y       N        Y          Y
Event-driven      Y       Y        Y       Y        N          N?
Plugin APIs       Y       N        Y      part      Y          Y
Forward Proxy     Y       N        N       Y        N          Y
Reverse Proxy     Y       Y        Y       Y        Y          Y
Transp. Proxy     N       Y        Y       Y        N          N
Load Balancer    part     Y!       Y       Y        Y          Y
Cache             Y       N        Y       Y        Y          Y
ESI              soon     N        N       Y        Y          N
ICP               Y       N        N       Y        N          N
Keep-Alive        Y       N        Y       Y        Y          Y
SSL               Y       N        Y       Y        N          Y
Pipeline          Y       N        Y       Y        N          Y
My name is Leif Hedstrom, and I work for Yahoo, in the Edge Services organization. For the last 5 years, I’ve worked on building the Y! CDN and various other edge services. If you have ever visited a Y! web page, you have most definitely used one or several of these edge services. Before we start, I’d like to point out that this presentation is also available as a white paper, which I’ve uploaded to the Velocity web site. If you can’t find it there, please email me, and I’ll make it available somewhere on our CDN.
Traffic Server started as a commercial product, developed and sold by Inktomi way back in the day. Yahoo! acquired Inktomi in 2003, and while developing our own CDN, we “found” Traffic Server lying on the shelves. After dusting it off and porting it to modern Linux, it immediately beat existing intermediaries hands down in our benchmarks, typically by 5x or more. In 2009, Y! donated the Traffic Server source code to the Apache Software Foundation, and in April of 2010, Apache Traffic Server became a TLP.
Before entering the Apache community, the Y! version of TS ran only on 32-bit Linux. Being Open Source quickly gave us not only 64-bit support, but also ports to most common Linux distributions, FreeBSD, OpenSolaris and Mac OS X. Performance has more than doubled since we released the code into the Apache Open Source community, and most of these improvements have come from external contributors.
* Before we go into the details of what drives Traffic Server, and how we use it, let me briefly discuss the three most common proxy server configurations. * In a forward proxy, the web browser has to be configured, manually or via auto-PAC files etc., to use a proxy server for all (or some) requests. The browser typically sends the “full” URL as part of the GET request. The forward proxy typically does not need to be configured with “allowed” destination addresses, but it can be configured with access control lists or blacklists controlling what requests are allowed, and by whom. A forward proxy is typically allowed to cache content, and a common use case is inside corporate firewalls.
A reverse proxy, a.k.a. a web accelerator, does not require the browser to cooperate in any special way. As far as the user (browser) is concerned, it looks like it’s talking to any other HTTP web server on the internet. The reverse proxy server, on the other hand, must be explicitly configured for what traffic it should handle, and how such requests are properly routed to the backend servers (a.k.a. origin servers). Just as with a forward proxy, many reverse proxies are configured to cache content locally.
An intercepting proxy, also commonly called a transparent proxy, is very similar to a forward proxy, except the client (browser) does not require any special configuration. As far as the user is concerned, the proxying happens completely transparently. A transparent proxy will intercept the HTTP requests, modify them accordingly, and typically “forge” the source IP before forwarding the request to the final destination. Transparent proxies usually also implement traffic filters and monitoring, allowing for strict control of what HTTP traffic passes through the mandatory proxy layer. Typical use cases include ISPs and very strictly controlled corporate firewalls.
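The difference between a forward proxy and the other two modes is visible right on the wire. As an illustration (hostnames are placeholders), the request lines look like this:

```
# Forward proxy: the browser knows about the proxy and sends the full URL
GET http://www.example.com/index.html HTTP/1.1
Host: www.example.com

# Reverse or intercepting proxy: the browser sends an ordinary request,
# with only the path in the request line
GET /index.html HTTP/1.1
Host: www.example.com
```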
Alright, so let’s talk about what problems a good HTTP (and/or proxy) server can solve. There are two primary concurrency problems for the server software developer to consider: How can the software handle tens of thousands of concurrent TCP connections? How can the software take advantage of modern multi-core CPUs? Commodity hardware today has 2, 4, or even 8 or more cores in each server. * Additionally, while solving these two problems, we have to make sure we don’t introduce other resource starvation, for example, memory pressure.
Multithreading allows a process to split itself, and run multiple tasks in “parallel”. There is significantly less overhead running threads compared to individual processes, but threads are still not free. They need memory resources, and incur context switches. It’s a known methodology for solving the concurrency problem, and many, many server implementations rely heavily on threads. Modern OSes have good support for threads, and standard libraries are widely available.
Deadlocks can occur, where two threads (or processes) each need to acquire the same two resources (e.g. locks), which can cause the application to stall completely (unrecoverably). Race conditions can occur, where the outcome is not deterministic, but depends on the timing or scheduling of thread execution. Threaded code is difficult to write and ‘get right’.
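As a minimal sketch (not Traffic Server code) of the classic two-lock deadlock, and the standard mitigation of acquiring locks in a globally consistent order:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

# Deadlock-prone pattern: thread 1 takes A then B while thread 2 takes B
# then A. If each grabs its first lock before the other releases, both
# stall forever. The standard fix: impose a global acquisition order, so
# every thread takes the locks in the same sequence (here, always A first).
def transfer(n):
    with lock_a:          # always acquired first
        with lock_b:      # always acquired second
            return n + 1

results = []
threads = [threading.Thread(target=lambda: results.append(transfer(1)))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# All threads complete, because no thread ever holds B while waiting for A.
```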
Events are scheduled by the event loop, and event handlers execute specific code for specific events. This makes it easier to code for: there is no risk of deadlocks or race conditions. It can handle a good number of connections (but not an unlimited number). Squid is a good example of an event-driven server.
An event handler cannot block on I/O, or the entire event processor stalls. CPU-intensive processing in an event handler can starve pending events of CPU (solution: cooperative multi-tasking, where the event handler yields the CPU). Latency can become an issue if an event processor has to handle a lot of events. It can only use one CPU / core.
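A toy sketch of the event-driven model described above (handler names and the two-phase request/response split are illustrative), including the cooperative-yield trick of scheduling a follow-up event instead of monopolizing the loop:

```python
import collections

# A toy single-threaded event loop: handlers run to completion, one at a
# time, so no locks are needed -- but a handler that blocks or spins would
# stall everything.
events = collections.deque()

def schedule(handler, data):
    events.append((handler, data))

log = []

def on_request(data):
    log.append("request:" + data)
    # Cooperative multi-tasking: instead of doing all the work here (and
    # starving other pending events), schedule a follow-up event and
    # return quickly.
    schedule(on_response, data)

def on_response(data):
    log.append("response:" + data)

schedule(on_request, "a")
schedule(on_request, "b")

while events:
    handler, data = events.popleft()
    handler(data)

# Both requests are accepted before either response runs, because each
# handler yields back to the loop.
```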
There are n worker threads per core, typically 2 or 3. This gives around 16–24 execution threads on typical modern hardware, each running an event loop. There are m I/O threads per disk spindle, used to deal with disk I/O outside of the worker threads; the default is 4. A critical configuration decision here is to scale this appropriately, particularly if a “disk” is RAIDed, and might have more than one spindle under the hood. There are also a small number of “helper” threads, for tasks like accepting new connections, producing log output, and aggregating and presenting stats. * All threads share resources, such as RAM and disk cache, configurations, stats and logs.
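As back-of-the-envelope arithmetic for the sizing described above (the function is illustrative; the per-core and per-spindle figures come from the text):

```python
# Sizing sketch: n worker threads per core (typically 2-3), m I/O threads
# per spindle (default 4). Helper threads are extra and not counted here.
def thread_counts(cores, spindles, workers_per_core=3, io_per_spindle=4):
    workers = cores * workers_per_core
    io = spindles * io_per_spindle
    return workers, io

# An 8-core box with 2 spindles:
workers, io = thread_counts(8, 2)  # 24 worker threads, 8 I/O threads
```

Note that if a “disk” is actually a RAID volume, `spindles` should count the physical drives underneath it, not the logical device.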
Traffic Server is obviously not the only HTTP intermediary in the Open Source community. Existing servers include Apache mod_proxy, Squid, NGINX, Varnish and HAproxy. This makes choosing a proxy server an interesting, but challenging, task. You really need to understand your problem space, your requirements, and any restrictions (like budget).
For me, there are three important areas to consider when choosing a proxy server (or probably any other server, for that matter): performance and scalability; features; and whether it is a good product for operations to manage, and for engineers to develop applications for. We’ll discuss these in detail, but the goal for Apache Traffic Server is obviously to be smack in the middle of this Venn diagram. We’re not quite there yet.
SMP scalability: how well does the server scale with multiple CPUs and cores? Can it take advantage of all available CPU (and other) resources on modern hardware? What sort of throughput can the server handle, in requests per second, or Mbps, etc.? How many concurrent users can the server handle? Can it cope with thousands, or tens of thousands, of concurrent users?
I wasn’t going to go into performance numbers, because out of context they are fairly useless, but here are some numbers from the Y! CDN and our lab. The Y! CDN runs on roughly 100 servers, most of which are idle most of the time. The reason for such a large deployment is that we cover most of the world, and also need to handle major outages as well as traffic spikes.
HTTP/1.1 is the standard HTTP protocol in use today; most browsers and servers use and support it. So should your intermediary server. There are many extensions and additional features an intermediary might want, for example, ICP for cache peering. Getting every corner case of HTTP/1.1 right is difficult, particularly for an intermediary. There is a lot of semantics overloaded into the standard HTTP headers. In many cases a regular HTTP server might not need to worry about all of this, but an intermediary probably does.
Easy to use, easy to configure, and generally easy to manage from an operational perspective. Resilience to crashes, corruptions and other operational nightmares. Extensible, making it easy to modify default behaviors, add functionality, extend with new code and features.
Traffic Server will monitor itself, and restart the main server process if something isn’t functional. Even through process restarts, the HTTP port is still being listened on, and new requests are queued up in the listen backlog. Most configurations can be modified and reloaded without server restarts. Adding plugins is easy: just drop them in place and restart the server.
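Traffic Server’s actual supervision is done by its own manager process; purely as an illustration of the watchdog idea, here is a minimal sketch (the retry limit and command are made up for the example):

```python
import subprocess
import sys

# Minimal watchdog sketch: run the server process, and restart it whenever
# it exits, up to a retry limit. Traffic Server's real manager also keeps
# the listen socket open across restarts, which this sketch does not do.
def supervise(cmd, max_restarts=3):
    starts = 0
    while starts <= max_restarts:
        starts += 1
        subprocess.run(cmd)  # blocks until the "server" exits
    return starts

# Stand-in for a crashing server: a Python process that exits immediately.
starts = supervise([sys.executable, "-c", "raise SystemExit(1)"])
```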
This table is much too large to go into in detail, but it shows that there are a number of features to take into consideration when choosing an intermediary. This is not a complete list in any way; it is merely an example of what features you might want to consider for your proxy choices.
Traffic Server was designed from the ground up with flexible and advanced configuration, yet attention is always paid to making the system as easy as possible to configure. The configuration defaults generally work really well for most setups. Getting started and evaluating Traffic Server is almost a zero-config experience.
Next I will discuss a couple of real life production use cases of using Traffic Server. Yahoo uses Traffic Server extensively, and long term all Y! traffic is expected to be fronted by one, or more, Traffic Servers.
The first, obvious use case is as a Content Delivery Network, or CDN. For Yahoo! this not only saves us a lot of money, but it also gives us much better control of our content, and allows for much easier integration with operational monitoring, debugging and support. Building a CDN is not difficult, but it may not be financially feasible for small sites; it can save a significant amount of money for larger sites, depending on size, CDN vendor pricing, etc. But probably more important, with your own CDN you have full insight into what is going on, and can debug and track problems much more easily.
A CDN should preferably be on the edge, to provide static content close to the user. But at a minimum it needs to be distributed enough to deal with network and colo outages. The CDN should make network problems, server problems, or simple maintenance tasks (mostly) transparent to users. The CDN also should make it easy for operations to distribute content world wide. This is one reason why caching proxies are such powerful tools for the Ops team. Finally, the CDN should hopefully save money for your company, using the cheapest possible distribution mechanism where possible.
Configuring Apache Traffic Server for a basic CDN is surprisingly straightforward. The defaults from the installation are mostly suitable, and only minor tweaking is necessary. First, update the key-value config file, records.config, with a few updated settings. In this example, I modify the HTTP port we’re listening on, and how much memory to use for the RAM cache. Second, we’ll need to provide the mapping rules for the reverse proxy. In this example, we provide both a forward and a reverse mapping rule for each origin server; the reverse mapping is only necessary if you expect to receive HTTP redirect responses, in which case Traffic Server will rewrite the Location: header accordingly. Finally, you need to specify disk storage for the cache. It can be one or more raw partitions or directories on a file system.
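A minimal setup along those lines might look like this (hostnames, paths, and sizes are placeholders, and the exact records.config variable names vary between Traffic Server versions):

```
# records.config -- key-value settings: listen port and RAM cache size
CONFIG proxy.config.http.server_port INT 8080
CONFIG proxy.config.cache.ram_cache.size INT 536870912

# remap.config -- reverse proxy mapping rules; reverse_map is only needed
# if the origin may send redirects whose Location: header must be rewritten
map http://www.example.com/ http://origin.example.com/
reverse_map http://origin.example.com/ http://www.example.com/

# storage.config -- disk cache: a raw partition, or a directory plus a size
/var/cache/trafficserver 512M
```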
Before we go into the next use case scenario, let’s discuss a couple of common problems that can severely affect your web site’s performance. There are several reasons why a web page might be slow to render; three common problems that I personally have to deal with are the TCP 3-way handshake, TCP congestion control, and DNS lookups. We’ll discuss the first two here, and explain what we’ve done at Yahoo! using Traffic Server to alleviate them.
In TCP, every new connection has to go through a setup phase, typically referred to as the 3-way handshake. As you can see from the picture above, this means that there’s a full round-trip worth of latency before the client can even send the first HTTP request to the server. Since latency introduced by 3-way handshake is associated with network round-trip time, it goes without saying that the longer the distance between the client and the server, the longer the latency until the HTTP request can be sent. The solution to this problem is generally to use HTTP keep-alive between the client and server, which is a major reason why it is so critical for Traffic Server, and other intermediaries, to be able to handle tens of thousands of concurrent connections.
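Back-of-the-envelope arithmetic makes the keep-alive argument concrete (the function and the numbers are illustrative, not measurements):

```python
# Each new TCP connection spends roughly one round-trip on the 3-way
# handshake before the HTTP request can even be sent. With keep-alive,
# that cost is paid once; without it, every request pays it.
def handshake_overhead_ms(requests, rtt_ms, keep_alive):
    connections = 1 if keep_alive else requests
    return connections * rtt_ms

# 20 requests to a server 100 ms away:
without_ka = handshake_overhead_ms(20, 100, keep_alive=False)  # 2000 ms
with_ka = handshake_overhead_ms(20, 100, keep_alive=True)      # 100 ms
```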
TCP congestion avoidance is a mechanism for TCP to avoid congestion on our networks. There are many different implementations of TCP congestion control, but the general idea is to start out ‘slow’, and increase the number of packets we have outstanding on the wire before we wait for an acknowledgement from the receiver. This is why congestion control is sometimes referred to as “slow start”. Similar to the 3-way handshake, this introduces latencies which are directly related to the round-trip time between client and server. Keep-alive generally doesn’t solve this, since by default (though sometimes configurable) an idle connection will force another slow start.
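A rough model shows why slow start adds round-trips on cold connections (the initial window, doubling behavior, and segment size are illustrative assumptions; real stacks differ):

```python
# Rough slow-start model: the congestion window starts at a few segments
# and doubles every round trip until the response is fully delivered.
def round_trips(response_bytes, mss=1460, initial_cwnd=3):
    segments = -(-response_bytes // mss)  # ceiling division
    cwnd, sent, rtts = initial_cwnd, 0, 0
    while sent < segments:
        sent += cwnd  # send a full window, then wait one RTT for ACKs
        cwnd *= 2
        rtts += 1
    return rtts

# A 100 KB response takes several round trips on a cold connection:
rtts = round_trips(100 * 1024)  # 5 RTTs under these assumptions
```

On a warm, continuously active connection the window is already large, which is exactly why the long-haul server-to-server connections described next are kept busy.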
To solve not only the 3-way handshake latency problems (which is easy), but also the congestion avoidance latency problems, we’ve deployed Traffic Server farms all over the world. Users will always connect to a TS farm that is close, preferably within tens of milliseconds of latency. For connections over long distances, the Traffic Servers will keep persistent connections to other servers (which could be another Traffic Server, or any other HTTP server that supports keep-alive). These connections are reused and shared between many users, and the congestion avoidance resets are avoided because these server-to-server connections are generally kept active all the time.
Improve system reliability and availability by routing HTTP (or other TCP protocols) to an appropriate and available server. SLB makes it easy to scale your server farms, by simply adding more servers behind the rotation. Ease of maintenance: taking servers in and out of “rotation” is easy, and requires little work from the operations team. Outages are handled automatically. Better performance by routing requests to an “optimal” backend server: this can increase cache locality, reduce memory usage (less data on each box), and provide user stickiness (where a particular user always hits the same backend server). * Hardware SLBs are expensive
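A minimal sketch of the sticky-routing idea (this is not Yahoo’s plugin; the key choice and hash scheme are illustrative):

```python
import hashlib

# Sticky HTTP routing sketch: hash a request key (e.g. a user cookie or
# the URL) to pick a backend, so the same key always lands on the same
# server, improving cache locality and user stickiness.
def pick_backend(key, backends):
    digest = hashlib.md5(key.encode()).hexdigest()
    return backends[int(digest, 16) % len(backends)]

backends = ["web1.example.com", "web2.example.com", "web3.example.com"]
b1 = pick_backend("user-1234", backends)
b2 = pick_backend("user-1234", backends)
# The same key always maps to the same backend.
```

A production balancer would typically use consistent hashing instead of a plain modulo, so that adding or removing a backend only remaps a fraction of the keys.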
Yahoo has begun work on replacing hardware SLBs with commodity servers running Traffic Server. These servers are an order of magnitude, or more, less expensive than typical hardware SLB offerings. Our custom plugin (hopefully soon to be Open Sourced) can make more interesting HTTP routing decisions. It’s not a general load balancer solution, but by focusing on HTTP (and HTTPS) we can optimize and improve the feature set for what we use the most.