2. Outline
Exove in brief
What is Cache Control and how does it work?
Easy case: Jatkoaika.com
Anonymous users, high read/write ratio
Hard case: Demi.fi
Authenticated users, low read/write ratio
Different case: Tekla Campus
Using Cache Control with a CDN
Discussion
4. WHAT IS CACHE CONTROL
AND HOW DOES IT WORK?
drupal.org/project/cache_control
5. What is Cache Control?
Module for integrating your site with Varnish or
some other HTTP cache
Sets appropriate Cache-Control headers in HTTP
responses from Drupal.
Also supports content purging
Automatic purges for e.g. node updates, hooks for
custom purges
Comes with an admin UI for selecting cacheable
menu router paths and a VCL file for Varnish
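As a rough illustration of the purging idea above, the following Python sketch builds (but does not send) purge requests for an updated node and the listing pages that show it. The function names, the `/node/<id>` path shape, the `cache.example` host and the use of a `PURGE` HTTP method are all assumptions made for this example, not the module's actual API; `PURGE` is a common convention that only works if the cache's VCL is configured to handle it.

```python
from typing import List
from urllib.request import Request


def paths_to_purge(node_id: int, listing_paths: List[str]) -> List[str]:
    """Return the node's own path plus any listing pages that display it.

    The path layout is an assumption for this sketch.
    """
    return [f"/node/{node_id}", *listing_paths]


def build_purge_requests(base_url: str, paths: List[str]) -> List[Request]:
    """Build one PURGE request per path without sending anything."""
    return [Request(base_url + path, method="PURGE") for path in paths]


# Hypothetical example: node 42 was updated and is also listed on the front page.
purge_requests = build_purge_requests(
    "http://cache.example", paths_to_purge(42, ["/"])
)
for req in purge_requests:
    print(req.get_method(), req.full_url)
```

Listing pages are included because an updated node usually appears on pages other than its own.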
6. How does it work?
Varnish checks if the requested page is cached
If it is, Varnish sends it to user’s browser (also for
authenticated users!)
If it isn’t, Varnish passes the request to Drupal, which executes the page
load as an anonymous user, and the response is cached in
Varnish
Process the response in user’s browser
For anonymous users, show the page as is
For authenticated users, generate personalized parts in
an AJAX back-end (get_components) and inject the
results on the page
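The cache side of this flow can be modelled in a few lines of Python. This is a toy simulation, not Varnish or the module itself: `PageCache` and `render_as_anonymous` are hypothetical names, and the point is only that every user gets the same anonymous rendering while the back-end is hit once per path.

```python
from typing import Callable, Dict


class PageCache:
    """Toy stand-in for an HTTP cache sitting in front of the back-end."""

    def __init__(self, render_as_anonymous: Callable[[str], str]) -> None:
        self._render = render_as_anonymous
        self._store: Dict[str, str] = {}
        self.backend_hits = 0

    def get(self, path: str) -> str:
        # Cache miss: render the page as an anonymous user and keep the result.
        if path not in self._store:
            self.backend_hits += 1
            self._store[path] = self._render(path)
        # Cache hit (or freshly stored page): same response for every user.
        return self._store[path]


def render_as_anonymous(path: str) -> str:
    # The empty placeholder is filled in later by the browser for logged-in users.
    return f"<html><div id='user-menu'></div><main>{path}</main></html>"


cache = PageCache(render_as_anonymous)
first = cache.get("/news")   # miss: goes to the back-end
second = cache.get("/news")  # hit: served from the cache
print(cache.backend_hits)
```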
7. “Personalized content?”
You can enable Cache Control for any Drupal
block – the block will be generated for
authenticated users in the get_components
back-end
Using Cache Control’s API, you can “tag” any
part of the page to be generated in the
get_components back-end
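One way to picture tagging is sketched below in Python. The speaker notes mention that each personalized component stores the function and arguments needed to generate it, plus an HTML id; everything else here (the `Component` class, the `register` decorator, the example generators) is hypothetical and only illustrates how a single get_components-style call could return all personalized parts at once.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical registry of generator functions, keyed by name.
GENERATORS: Dict[str, Callable[..., str]] = {}


def register(name: str):
    """Register a function that can generate a personalized component."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        GENERATORS[name] = func
        return func
    return decorator


@dataclass(frozen=True)
class Component:
    html_id: str       # where the result is injected on the page
    generator: str     # name of the registered function
    args: Tuple = ()   # extra arguments for that function


@register("greeting")
def greeting(user: str) -> str:
    return f"Hello, {user}!"


@register("unread")
def unread(user: str, folder: str) -> str:
    return f"{user} has new messages in {folder}"


def get_components(user: str, components: List[Component]) -> Dict[str, str]:
    """Generate every tagged component for one user in a single call."""
    return {c.html_id: GENERATORS[c.generator](user, *c.args) for c in components}


tagged = [Component("user-menu", "greeting"), Component("inbox", "unread", ("forum",))]
result = get_components("anna", tagged)
print(result)
```

The browser would then place each returned fragment into the element with the matching id.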
8. Benefits of Cache Control
Only the needed parts are loaded: The back-
end is significantly less burdened
All personalized parts of the page are loaded in a
single request
The user is given something to look at while the
hard parts of the page are being loaded – the
site feels faster
9. What’s the catch?
Building high-performance sites is a complex
matter. Cache Control is not a magic bullet to
solve all your performance issues
While developing, you have to “think in Cache
Control” or you’ll be in a world of trouble
You will most likely end up writing at least some
custom code and spending time wondering why the
site behaves differently when Cache Control is
enabled
10. What about ESI?
ESI (Edge Side Includes) is a partial loading
technique supported by Varnish and some CDNs,
e.g. Akamai
It basically makes Varnish do the partial page
loading
Varnish first fetches the common version from cache
Then it looks through the page for any ESI markup
Then it loads all the ESI-marked parts of the page from
cache or from Drupal
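The ESI steps above can be sketched in Python as a simple substitution pass. This is a simplified model, assuming only self-closing `<esi:include src="..."/>` tags; the `fetch_fragment` callback and the fragment paths are hypothetical stand-ins for a lookup in cache or a request to Drupal.

```python
import re
from typing import Callable, List

# Matches self-closing include tags such as <esi:include src="/esi/menu"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def process_esi(page: str, fetch: Callable[[str], str]) -> str:
    """Replace every ESI include tag with the fragment returned by fetch()."""
    return ESI_INCLUDE.sub(lambda match: fetch(match.group(1)), page)


fetched: List[str] = []


def fetch_fragment(src: str) -> str:
    # Stand-in for "load from cache or from the back-end"; records each fetch.
    fetched.append(src)
    return f"[{src}]"


page = '<p>News</p><esi:include src="/esi/menu"/><esi:include src="/esi/inbox" />'
assembled = process_esi(page, fetch_fragment)
print(assembled)
print(len(fetched), "fragment fetches")
```

Each include results in its own fetch, so a page with many personalized fragments triggers many lookups.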
11. How does Cache Control
differ from ESI?
ESI needs to wait until the whole page is loaded
before giving anything to the user
ESI loads all the portions of the page (still in D7,
this might change in D8) in separate HTTP
requests, thus burdening the server with even
more bootstraps than without any cache
13. Jatkoaika.com
Jatkoaika.com is the leading ice hockey site in
Finland
200 000 unique visitors and 1.6M page loads per
week
Page loads in Drupal are almost exclusively
done by anonymous users
Content is read a lot more often than
written, making the site an ideal use case for
Cache Control
15. Jatkoaika.com – Setup
Drupal, MySQL, SOLR, memcached, Varnish –
all running on one server
Cache Control enabled for all content pages
(nodes, taxonomy terms, front page) with
different TTLs – no custom code required
Server loads are minimal
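A per-path TTL setup of this kind might look roughly like the Python sketch below. The patterns, TTL values and the exact header values are assumptions for illustration; the module's real configuration lives in its admin UI and may emit different headers.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

# Hypothetical path patterns and TTLs in seconds; first match wins.
TTL_BY_PATTERN: List[Tuple[str, int]] = [
    ("/", 60),                  # front page: short TTL so new content shows up
    ("/node/*", 600),           # content pages
    ("/taxonomy/term/*", 300),  # term listings
]


def cache_control_header(path: str) -> str:
    """Return a Cache-Control header value for the given path."""
    for pattern, ttl in TTL_BY_PATTERN:
        if fnmatchcase(path, pattern):
            return f"public, max-age={ttl}"
    # Paths that are not configured as cacheable are not cached.
    return "no-cache"


for example_path in ["/", "/node/12", "/user/1/edit"]:
    print(example_path, "->", cache_control_header(example_path))
```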
17. Demi.fi
Demi.fi is the community around the Demi
magazine, targeted to teenage girls
2.8M weekly page views
Most page loads done by authenticated users
1 300 – 1 500 logged-in users during busy hours
The users generate a lot of content (forum
posts, comments, etc.)
Keeping the cache up to date is a challenge
19. Demi.fi – Setup
Drupal, MySQL (Percona), SOLR, MongoDB, nginx
+ php-fpm, memcached, Varnish – all running on
(almost) one server
Cache Control enabled for almost all user-facing
pages and some AJAX back-ends as well
A lot of personalized components per page, putting strain
on the get_components back-end
Quite a lot of custom code required to make the site
compatible and to trigger cache purges when needed.
Server loads are significant but mostly tolerable
20. Demi.fi – Strategy
Avoid Drupal bootstrap and theming
Cache Control: try to keep as much content in
Varnish cache as possible
Fast JSON-based back-ends for data that changes
often (e.g. forum topic listings): offload theming to
users’ browsers. Use Cache Control to cache the
results with a short TTL (30 secs or so)
Use fast storage: SOLR for Views, MongoDB for
field storage, memcached for cache.
Get a good sysadmin
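The short-TTL idea from the strategy above can be illustrated with a small in-memory cache in Python. This is only a model of the behaviour; on the real site the caching is done by Varnish, and the `TTLCache` class, the `forum_topics` producer and the injected clock are hypothetical names used for the example.

```python
import json
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Cache that regenerates a value once it is older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, produce: Callable[[], str]) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        # Regenerate when the entry is missing or has expired.
        if entry is None or entry[0] <= now:
            entry = (now + self._ttl, produce())
            self._entries[key] = entry
        return entry[1]


# A fake clock makes the example deterministic.
now = [0.0]
calls: List[int] = []


def forum_topics() -> str:
    calls.append(1)  # count how often the back-end is actually asked
    return json.dumps({"topics": ["a", "b"]})


topic_cache = TTLCache(30, clock=lambda: now[0])
topic_cache.get("forum", forum_topics)   # t=0: generated
now[0] = 10.0
topic_cache.get("forum", forum_topics)   # t=10: still cached
now[0] = 31.0
topic_cache.get("forum", forum_topics)   # t=31: expired, regenerated
print(len(calls))  # 2
```

With a 30-second TTL, a listing requested many times per second still reaches the back-end only about twice a minute, at the cost of being up to 30 seconds stale.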
21. Demi.fi – Lessons Learned
Cache Control’s get_components back-end needs
to be fast
Cache Control now supports MongoDB as storage
backend
Cache Control’s front-end needs to be fast
We had to rethink how to manipulate a page that has
lots of personalized content
Continuous cache purging can also be a
performance issue
Varnish 3.0-style bans take up a lot of resources, use
purges (2.0-style bans) instead
22. Demi.fi – More Lessons
Building high-performance sites is hard, and it gets
harder if you don’t take performance into account
from the very beginning
This includes design: be aware of the performance cost of
displaying a certain piece of content on a page, identify
and mitigate potential performance killers
Cache Control is far from perfect and doesn’t alone
solve your problems
Ironing out small glitches with e.g. cache purging
can be a lot of work
24. Tekla Campus
Tekla Campus is an e-learning tool and
community for engineering and construction
students
Users come from all over the world
Almost all of them are authenticated
Not that much user-generated content,
moderate amount of personalized content for
logged-in users
26. Tekla Campus – Setup
The site is hosted in Finland, but the user base is
spread all over the world
To mitigate latency, we needed a CDN solution
Turns out Fastly CDN uses Varnish, so we
decided to give it a go
Cache Control plays nicely with Fastly, even
cache purges work out of the box
Fastly even allows you to upload your own VCL
28. Summary
Cache Control is a module for integrating your
site with e.g. Varnish. It works for both
anonymous and authenticated users
It can help make your site a lot faster
It can be easy or hard, depending on the complexity
of your site
You can also use it to help with geographical
distribution of your site
29. THANK YOU!
WHAT DID YOU
THINK?
Locate this session at the
DrupalCon Prague website:
http://prague2013.drupal.org/schedule
Click the “Take the survey”
link
Editor's Notes
-CC: not going to go into details
-60 people, most of whom are developers
-Janne Kalliola is the chair of the Business and Strategy track
-address to the project page, check out the code if you like
-I’ll be talking about Varnish, because it’s most familiar to us (tested also with nginx cache, doesn’t support purges)
-purges: consider listings on e.g. the front page (+compare purges with the Columbia Law School Tag! session from before)
-You can also select TTL per path
-context switch made if page is 1) set cacheable in the ui 2) accessible by anonymous user 3) other details not worth mentioning
-some technical details: for each personalized component, store function and arguments that are needed to generate the data, also html id
-some bootstrapping + heavy theming is avoided
-get_components still requires bootstrap, run with the current user session (NOT CACHED)
-in some cases, the site feeling faster is really just a feeling
-complex matter: using Varnish isn’t the only thing you need to do
-”thinking in cache control”: what’s going to be personalized, can something be done differently etc. Do this as early as possible!
-custom code: call hook_cache_control for tagging components (unobtrusive, you can disable cc at any time because of this), load some js and css. Why site looks different? Because Drupal
-when we released Cache Control, one of the first questions was about ESI
-might have changed, haven’t checked for a while
-forum not Drupal
-cache control disabled for admins
news, teams, results, statistics…
-no purges (except for automatic ones)
-content propagation mostly handled with low TTLs (front page etc.)
-around since 1998 or so, this is the fourth incarnation, huge migration, 250 000 registered users, millions of nodes (threads, community pages, blog posts)
-teenage girls really let you know if something’s wrong
-every page has a lot of personalized components, which is a challenge
-forum listing, json backend
-personalized content on the right
-almost one server: php-fpm partly offloaded to another server
-custom code: js/css loading, purging, redirects so that purges work (Cache Control at its worst)
-json-backend: not directly related to cc, but is an example of the fact that cc alone doesn’t solve your problems
-mention front themer?
-sysadmin: most Drupal devs are not in their comfort zone with Varnish, db optimization, server configuration etc.
-mongodb panic rewrite
-varnish loads
-form cache + memcached -> problems with our number of forms (space just runs out) -> move to mysql -> loads skyrocket
-glitches: redirects, messages, css/js
-don’t really know if this can be avoided due to the way Drupal handles things
-support forum is probably the most user-generated content there is
-very simple: lessons to use the tools
-in addition, Fastly has good coverage of nodes throughout the world
-other options: Akamai – trouble with POST requests (?)
-seems happy, but is not as happy as Jatkoaika: some custom code was needed
-it does this by manipulating cache-control HTTP headers (integration with Varnish), caches anonymous pages, personalizes on AJAX
-easy = not that much personalized content (or personalized content), few purges
-hard = custom code, personalized content, lots of purges, small js/css glitches
-geographical distribution = together with a suitable CDN