
When dispatcher caching is not enough by Jakub Wądołowski

Published in: Technology

  1. When dispatcher caching is not enough… Jakub Wądołowski, Senior Systems Engineer @ Cognifide
  2. Agenda
     • The What
     • The Why
     • The How
  3. The What
  4. It all started in 2012… (image: www.flickr.com/photos/nasahqphoto/16327416694)
  5. To be perfectly honest, initially it was rather like that… (image: www.flickr.com/photos/garryknight/5703519506)
  6. The client (image: www.flickr.com/photos/worak/2258271659)
     • EU pharmaceutical company
     • 75 offices across the globe
     • Over 40,000 employees
     • Medical products available worldwide (180+ countries)
  7. Requirements
     • Country-specific brochureware websites for medical products
     • iPad app for sales representatives
     • Single point for content entry
     • Multiple integration points (SSO, user/device authentication, etc.)
     • CQ 5.5, upgrade to AEM 6.1 in progress
  8. Main components
     • Brochureware website
     • iPad app
     • AEM Authoring
  9. Logical architecture
     • Single datacenter in London (Rackspace)
     • REST-like API for iPad app
     • Integrations with local and remote services
  10. Initially it was just Spain, Argentina and Sweden
  11. 6 months later the number of countries had tripled
  12. It finally reached 21, and it is still not over
  13. The Why
  14. “Our team in Argentina complains that the app feels slow. They can’t download presentations sometimes. Could you please investigate that?” Mr B. (image: www.flickr.com/photos/r4vi/8640618489)
  15. Problems
     • Latency, latency, latency…
     • Way too high round trip times (RTT)
     • Timeouts
     • Broken streams
     • Connection resets
     • Poor Internet connections in some areas
  16. Solutions
  17. It has been decided that Hong Kong is the way to go for us
  18. There are over 10,000 km between London and Buenos Aires…
  19. …which is nearly the same distance as between London and Hong Kong
  20. When the initial excitement was gone…
     • Client-server problems became server-server ones
     • How are we going to sync all the changes (both ways)?
     • What about deployments?
     • Do we have enough licenses?
     • What’s the best way to implement content sharding?
     • How long will it take to implement all of these things?
  21. PoC conclusion (image: www.flickr.com/photos/geishaboy500/2496995573)
  22. The road to CDN
     • We can’t just cache more on dispatcher
     • This is a very well-known problem
     • Let’s use the right tool to solve the problem the right way
     • A Content Delivery Network (CDN) is the way to go!
  23. CDN definition
     • “(…) CDN is a large distributed system of servers deployed in multiple data centers across the Internet. The goal of a CDN is to serve content to end-users with high availability and high performance. CDNs serve a large fraction of the Internet content today (…).”, Wikipedia
  24. AEM + CDN
  25. CDN, huh? (image: www.flickr.com/photos/pictures-of-money/16678590844)
  26. That's not necessarily true nowadays… (image: www.flickr.com/photos/halfrain/14410890555)
  27. Why Fastly?
     • Pay-as-you-go model
     • Powered by Varnish
     • Highly customizable (ability to upload your own VCL)
     • 150 ms to purge globally
     • ~5 sec to change a config through the web API
     • SSD-powered servers connected to Tier 1 networks
     • Real-time insight into what’s happening (graphs, logs, etc.)
     • Great support
  28. https://www.fastly.com/network
  29. Still not convinced?
  30. The How
  31. Ok… how should I start? (image: www.flickr.com/photos/kleuske/8004416109)
  32. The logs! (image: www.flickr.com/photos/martinbamford/5638834940)
  33. Logs and content structure
     • grep, awk, sed: all of these are your friends
     • Count your requests
     • Leverage the power of log monitoring tools (ELK, Splunk, etc.)
     • Plan your content structure carefully
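The "count your requests" step can be sketched in a few lines of Python as an alternative to a grep/awk pipeline. This is a minimal sketch that assumes an Apache-style access log; the sample lines and the `count_requests` helper are hypothetical and not taken from the talk.

```python
import re
from collections import Counter

# Matches the request part of an Apache-style log line, e.g. "GET /path HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def count_requests(lines):
    """Count (method, path) pairs, ignoring query strings."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like HTTP requests
        path = match.group("path").split("?", 1)[0]
        counts[(match.group("method"), path)] += 1
    return counts


# Hypothetical sample lines, for illustration only
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jan/2016:10:00:00 +0000] "GET /bin/myapp/v1/a_string.json HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2016:10:00:01 +0000] "GET /bin/myapp/v1/a_string.json?x=1 HTTP/1.1" 200 512',
    '10.0.0.3 - - [01/Jan/2016:10:00:02 +0000] "POST /bin/myapp/v1/login HTTP/1.1" 302 0',
]

if __name__ == "__main__":
    for (method, path), n in count_requests(SAMPLE_LOG).most_common():
        print(n, method, path)
```

Sorting by frequency makes the handful of paths that dominate traffic stand out, which is usually where pattern hunting starts.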
  34. Look for patterns (image: www.flickr.com/photos/wwarby/4915777722)
  35. Request patterns
     • If it is a GET request and starts with /bin/myapp/v[1-2]/a_string.json then it is X
     • All requests to /content/something/*/_jcr_content.zip end with 302 to /some/path/to/file.zip
  36. Assign these patterns to multiple buckets (image: www.flickr.com/photos/ddebold/15991919514)
  37. Content groups/buckets
     • Public content
     • Private content
     • Content available for authorized users only
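Assigning request patterns to buckets could look roughly like the sketch below. The two patterns come from the slides, but which bucket each one belongs to is an assumption made for illustration, as are the `RULES` table and the `classify` helper.

```python
import re

# (method, path pattern, bucket). The pattern-to-bucket mapping is hypothetical.
RULES = [
    ("GET", re.compile(r"^/bin/myapp/v[1-2]/a_string\.json"), "private"),
    ("GET", re.compile(r"^/content/something/.+/_jcr_content\.zip$"), "authorized"),
]


def classify(method, path, default="unclassified"):
    """Return the bucket of the first matching rule, or the default."""
    for rule_method, pattern, bucket in RULES:
        if method == rule_method and pattern.search(path):
            return bucket
    return default


if __name__ == "__main__":
    print(classify("GET", "/bin/myapp/v2/a_string.json"))
    print(classify("GET", "/content/something/es/_jcr_content.zip"))
    print(classify("POST", "/bin/myapp/v2/a_string.json"))
```

Anything that falls through to `unclassified` is a request type that still needs a decision, so it is safer not to cache it until a rule exists.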
  38. Varnish in 1 slide!
     • Reverse HTTP proxy
     • In-memory time based cache
     • Blazing-fast
     • Big “state” machine
     • Varnish Configuration Language (VCL)
     • Full control of HTTP flow
  39. General caching rules
     • Cacheable methods: GET, HEAD
     • Cacheable response codes:
        • 200, 203
        • 300, 301, 302
        • 404, 410
     • “Cache-Control: private” if not defined otherwise
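The general caching rules can be modelled as a small predicate. This is a sketch under one reading of the slide: a response without a Cache-Control header is treated as `private`, and `private` responses are kept out of the cache space shared between users. The function names are hypothetical.

```python
CACHEABLE_METHODS = {"GET", "HEAD"}
CACHEABLE_STATUSES = {200, 203, 300, 301, 302, 404, 410}


def effective_cache_control(headers):
    """Return the Cache-Control value, defaulting to 'private' when unset."""
    for name, value in headers.items():
        if name.lower() == "cache-control":
            return value
    return "private"


def is_shared_cacheable(method, status, headers):
    """True if the response may be stored in a cache shared between users."""
    if method.upper() not in CACHEABLE_METHODS:
        return False
    if status not in CACHEABLE_STATUSES:
        return False
    directives = {
        d.strip().lower() for d in effective_cache_control(headers).split(",")
    }
    # 'private' and 'no-store' both rule out the shared cache space
    return not ({"private", "no-store"} & directives)


if __name__ == "__main__":
    print(is_shared_cacheable("GET", 200, {"Cache-Control": "public, max-age=600"}))
    print(is_shared_cacheable("GET", 200, {}))
```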
  40. Let’s start with the iPad app (image: www.flickr.com/photos/pestoverde/15048774061)
  41. iPad – HTTP flows
     • 3 request types
        • REST API request
        • Presentation request (ZIP files)
        • Image request
  42. iPad app content
     • 2 content groups
        • Private
        • For all authorized users
     • 8 request patterns
     • TTL varies from 10 minutes to 7 days
     • 35/65 dynamic/static content (frequently changing JSON files vs PDFs/PNGs)
     • All REST API responses are private
  43. Private content
     • Private content is cacheable
     • What makes an HTTP response private?
     • It is tied to a user session; in other words, the HTTP request carries a unique authorization cookie
  44. Is it really safe to cache that type of content? (image: www.flickr.com/photos/hyku/368912557)
  45. Private cache
     • Varnish cache is a key-value store
     • Default key: req.url + req.http.host
     • req.url + req.http.host + sessionId = private cache space. Voilà!
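The private cache space idea can be illustrated with a toy model of the cache key. In Varnish the key is built in VCL; the Python below only mirrors the concept, and the `cache_key` helper and its inputs are hypothetical.

```python
import hashlib


def cache_key(url, host, session_id=None):
    """Build a cache key; adding a session id yields a per-user cache space."""
    parts = [url, host]
    if session_id is not None:
        parts.append(session_id)
    # Join with a separator that cannot appear in these values, then hash
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    shared = cache_key("/bin/myapp/v1/a_string.json", "example.com")
    alice = cache_key("/bin/myapp/v1/a_string.json", "example.com", "session-a")
    bob = cache_key("/bin/myapp/v1/a_string.json", "example.com", "session-b")
    print(shared != alice, alice != bob)
```

Two users requesting the same URL get different keys, so one user's cached response can never be served to another, while repeat requests within a session still hit the cache.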
  46. Dynamic means uncacheable? (image: www.flickr.com/photos/gsfc/7402445224)
  47. Dynamic content
     • Caching usually involves a trade-off
     • Updates won’t be instantaneous
        • TTL has to expire, or
        • a purge request has to be triggered
     • CDN is the way to go if you accept this delay
  48. Content purging (image: www.flickr.com/photos/librariesrock/13522859053)
  49. Content purging
     • Fastly exposes a purge REST API
     • Purge URL
     • Purge Key
        • Purge all assets marked with a special “label”
        • https://www.fastly.com/blog/surrogate-keys-part-1
     • Purge All
     • Purge vs Soft Purge
        • https://www.fastly.com/blog/introducing-soft-purge
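The purge variants listed above map onto a few HTTP requests. The sketch below only builds request descriptions and sends nothing; the endpoint paths and the `Fastly-Key` and `Fastly-Soft-Purge` headers follow Fastly's public purge API as I understand it, so check them against the current documentation. The `PurgeRequest` type, the helper names, and the service id, key, and token values are hypothetical.

```python
from typing import Dict, NamedTuple
from urllib.parse import quote

API_BASE = "https://api.fastly.com"


class PurgeRequest(NamedTuple):
    method: str
    url: str
    headers: Dict[str, str]


def _headers(api_token=None, soft=False):
    headers = {"Accept": "application/json"}
    if api_token:
        headers["Fastly-Key"] = api_token
    if soft:
        # Soft purge marks objects as stale instead of evicting them
        headers["Fastly-Soft-Purge"] = "1"
    return headers


def purge_url(content_url, api_token=None, soft=False):
    """Purge a single object by sending PURGE to its own URL."""
    return PurgeRequest("PURGE", content_url, _headers(api_token, soft))


def purge_key(service_id, surrogate_key, api_token, soft=False):
    """Purge every object labelled with the given surrogate key."""
    url = (f"{API_BASE}/service/{quote(service_id, safe='')}"
           f"/purge/{quote(surrogate_key, safe='')}")
    return PurgeRequest("POST", url, _headers(api_token, soft))


def purge_all(service_id, api_token):
    """Purge the whole cache of a service."""
    url = f"{API_BASE}/service/{quote(service_id, safe='')}/purge_all"
    return PurgeRequest("POST", url, _headers(api_token))


if __name__ == "__main__":
    print(purge_key("SERVICE_ID", "presentations", "API_TOKEN", soft=True))
```

Each `PurgeRequest` can be handed to any HTTP client. Keeping construction separate from sending also makes it easy to log or dry-run purges during a deployment.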
  50. Results (image: www.flickr.com/photos/89228431@N06/11322953266)
  51. iPad app statistics
     • Hit ratio: 49.9%
     • Cache coverage: 66.1%
     • Requests: 89K
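For readers unfamiliar with the two metrics, the sketch below uses the definitions commonly given for Fastly statistics: hit ratio is hits over hits plus misses, and cache coverage is the share of requests that were eligible for caching at all. The request counts are hypothetical, chosen only so the results land near the percentages on the slide; they are not data from the talk.

```python
def hit_ratio(hits, misses):
    """Share of cacheable requests that were served from cache."""
    cacheable = hits + misses
    return hits / cacheable if cacheable else 0.0


def cache_coverage(hits, misses, passes):
    """Share of all requests that were cacheable (not passed to origin)."""
    total = hits + misses + passes
    return (hits + misses) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts, for illustration only
    hits, misses, passes = 499, 501, 513
    print(f"hit ratio: {hit_ratio(hits, misses):.1%}")
    print(f"cache coverage: {cache_coverage(hits, misses, passes):.1%}")
```

The two numbers answer different questions: coverage says how much traffic the caching rules reach, while hit ratio says how well the cache performs on the traffic it does reach.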
  52. What about the speed? (image: www.flickr.com/photos/129341635@N02/16609174727)
  53. Speed boost
     • Presentation downloads
        • Europe: up to 21% faster
        • South America: up to 50% faster
        • APAC: up to 83% faster
     • API responses
        • Europe: up to 60% faster
        • South America: up to 40% faster
        • APAC: up to 55% faster
  54. Issues? (image: www.flickr.com/photos/giuseppemilo/15414290956)
  55. Crimes against cacheability (image: www.flickr.com/photos/alancleaver/4121423119)
  56. Crimes against cacheability
     • Adding Set-Cookie to every response
     • Auth cookie is not revoked in the browser after logout
     • TBD
  57. “iPad app performance is much better now! But we still have some issues with authoring. It is really slow in some countries.” Mr B. (image: www.flickr.com/photos/r4vi/8640618489)
  58. CDN in front of authoring?
     • I was rather skeptical
     • Way too dynamic to be considered cacheable?
     • What kind of improvement might we get? 5-10%? Is it worth it?
     • I don’t know how, but it was decided to roll things out
  59. CDN + AEM Author
     • 3 content groups
     • 36 request patterns
     • TTL up to 14 days
     • Mostly dynamic + static web GUI resources
     • A lot of assets common for every logged in user

     Request pattern                                           Cacheable?
     /apps/cq/core/content/login/.*(png|jpg|css|js)$           YES
     /libs/cq/i18n/dict.en.json                                YES
     /etc/.*.(png|woff|css|js|jpg|gif|ttf|svg|eot|swf|ico)$    YES
     /cf#/content/myapp/en/about.html                          NO
  60. Authorized only! (image: www.flickr.com/photos/rudyjuanito/5170435542)
  61. Authorize at the edge
     • CDN knows nothing about user session
     • The goal is to cache common content for successfully authorized users
     • Authorize them at the edge!
  62. Auth tokens (image: www.flickr.com/photos/cfortier/426610972)
  63. Auth tokens
     • 2nd auth cookie (token), readable by CDN
     • HMAC function
     • 2 auth cookies are tied together
     • Reference implementation: https://github.com/fastly/token-functions
     • Private key shared between AEM and CDN
     • CDN can evaluate user session without request to AEM
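The token idea can be sketched as follows: the origin signs the session id and an expiry time with a shared secret, and the edge recomputes the signature to validate the user without calling the origin. The token layout, function names, and secret below are assumptions for illustration; this is not the format used by fastly/token-functions or by the project described in the talk.

```python
import hashlib
import hmac


def make_token(session_id, expires_at, secret):
    """Issue '<expiry>_<hex HMAC>' binding the token to the session cookie."""
    message = f"{session_id}|{expires_at}".encode("utf-8")
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{expires_at}_{signature}"


def verify_token(token, session_id, secret, now):
    """Validate a token using only the shared secret (no origin request)."""
    expires_str, _, _signature = token.partition("_")
    if not (expires_str.isascii() and expires_str.isdigit()):
        return False
    if int(expires_str) < now:
        return False  # token has expired
    expected = make_token(session_id, int(expires_str), secret)
    # Constant-time comparison avoids leaking how many characters matched
    return hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))


if __name__ == "__main__":
    secret = b"hypothetical-shared-secret"
    token = make_token("session-a", 2_000_000_000, secret)
    print(verify_token(token, "session-a", secret, now=1_500_000_000))
    print(verify_token(token, "session-b", secret, now=1_500_000_000))
```

Because the session id is part of the signed message, a token copied into another session fails verification, which is what ties the two cookies together.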
  64. 96.3% (image: www.flickr.com/photos/spacexphotos/16169087563)
  65. Author statistics
     • Hit ratio: 96.3%
     • Cache coverage: 45.7%
     • Requests: 83K
  66. Crimes against cacheability
     • Adding Set-Cookie to every response
     • Auth cookie is not revoked in the browser after logout
     • “Vary: Cookie” usage
  67. What about deployments? (image: www.flickr.com/photos/aushiker/20369395093)
  68. AEM deployments
     • Does every deploy involve a full CDN cache purge?
     • Nope!
     • iPad presentations are packaged in a ZIP file and versioned
     • Majority of authoring related cacheable assets stay untouched between deployments
  69. Summary (image: www.flickr.com/photos/andrewhurley/6254409229)
  70. Summary
     • Traffic growth is no longer an issue
        • Over 2 TB monthly reaches CDN servers
        • ~5.5 million HTTP requests per month
        • Just ~570 GB was passed through to AEM
     • License, budget and time savings
     • More than satisfying results
     • Very small changes in the AEM app itself
     • Happy client
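The traffic figures in the summary imply how much load the CDN takes off AEM. The arithmetic below is a rough sketch that treats "over 2 TB" as exactly 2000 GB, so the result is a lower bound; the `origin_offload` helper is hypothetical.

```python
def origin_offload(total_gb, origin_gb):
    """Fraction of traffic served by the CDN without reaching the origin."""
    return 1 - origin_gb / total_gb


if __name__ == "__main__":
    # 2 TB taken as 2000 GB (assumption); 570 GB passed through to AEM
    print(f"offload: {origin_offload(2000, 570):.1%}")
```

Under that assumption roughly 71.5% of the monthly traffic never reaches AEM.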
  71. jakub.wadolowski@cognifide.com, github.com/jwadolowski, twitter.com/jwadolowski, linkedin.com/in/kubawadolowski/en
