3. WHY IS NOT CACHING BAD?
•
Keep executing the same code with the same data
•
Waste computing power getting the same result
•
That power is probably generated by burning coal*
•
Burning stuff produces tons of CO2**
* it most likely is not
** probably a smaller unit of mass
4. Too much CO2 will make THE EARTH EXPLODE*
* based on pure speculation
5. WHY SHOULD YOU CARE?
•
Your web apps will become WAY faster
•
Users and search engines will like you MORE
•
You will use A LOT less hardware resources and/or save $$$
•
You will generate LESS CO2
•
The Earth will NOT explode and/or you’ll have more $$$
•
Women like people who save the world and/or have $$$
•
And lots of other stuff*
* 0 or greater amount of other stuff
7. WHY YOU SHOULD AVOID USING TTL
•
You might use obsolete data
•
Your server might get a cache stampede and go down
•
You should PUSH the fresh data in your cache as soon as you have it,
BEFORE the old one has expired from the cache
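The push idea above can be sketched in C. This is a minimal single-entry model, not memcached itself: the struct layout and the names cache_push and cache_get are hypothetical, and time is passed in as a parameter so the behaviour is easy to follow. The producer of fresh data calls cache_push before the old value expires, so readers never hit the miss that triggers a stampede.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* One cached value with an absolute expiry time (hypothetical layout). */
struct cache_entry {
    char value[64];
    time_t expires_at;
};

/* Store a value and restart its lifetime. Called by whatever produces
 * fresh data, as soon as that data exists. Long values are truncated. */
void cache_push(struct cache_entry *entry, const char *value,
                time_t now, long ttl_seconds)
{
    assert(entry != NULL && value != NULL);
    strncpy(entry->value, value, sizeof entry->value - 1);
    entry->value[sizeof entry->value - 1] = '\0';
    entry->expires_at = now + ttl_seconds;
}

/* Return the cached value, or NULL on a miss (the entry has expired). */
const char *cache_get(const struct cache_entry *entry, time_t now)
{
    assert(entry != NULL);
    return (now < entry->expires_at) ? entry->value : NULL;
}
```

If the value is only pushed once, every reader arriving after the TTL gets NULL and has to regenerate it at the same time. If it is pushed again shortly before expiry, readers keep getting a hit and simply see the newer value.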
17. HOW DO I CACHE THINGS?
1. Create a Memcached instance
$memcached = new Memcached;
$memcached->addServers( $memcachedServers );
2. Put data in
$memcached->set( $key, $value, $expireAt );
3. Get data out
$memcached->get( $key );
19. HELPFUL TIPS
•
It’s best if you cache the final result of an operation rather than the input data
•
You should always have a fallback if you get a cache miss
•
Try to avoid flushing the entire cache; use clever key names instead
•
Use Memcached::getAllKeys() to help you manage/release/update data
•
Use Memcached::getStats() to help you improve efficiency
•
Have a warmup script!
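One possible reading of "clever key names" is to put a version number into every key of a group. The sketch below assumes that scheme; the function name build_cache_key and the "namespace:vN:id" layout are made up for illustration and are not from the slides. Bumping the version makes all old keys in that namespace unreachable without flushing anything else.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* memcached rejects keys longer than 250 bytes. */
#define MAX_KEY_LEN 250

/* Build "<namespace>:v<version>:<id>" into out.
 * Returns 0 on success, -1 if a part is empty or the key does not fit. */
int build_cache_key(char *out, size_t out_size, const char *ns,
                    unsigned version, const char *id)
{
    assert(out != NULL && ns != NULL && id != NULL);
    if (strlen(ns) == 0 || strlen(id) == 0) {
        return -1;
    }
    int written = snprintf(out, out_size, "%s:v%u:%s", ns, version, id);
    if (written < 0 || (size_t)written >= out_size || written > MAX_KEY_LEN) {
        return -1;
    }
    return 0;
}
```

With this scheme the current version per namespace has to live somewhere the app can read cheaply (for example in the cache itself). Old entries are never deleted explicitly; they just stop being requested and eventually fall out.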
20. WHAT TO CHECK IN getStats()
…
…
["get_hits"]=>int(110825125)
["get_misses"]=>int(17396765)
["evictions"]=>int(0)
…
…
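A quick way to read these three counters is to turn them into a hit ratio. The helper below is a sketch; the function names and the idea of a minimum acceptable ratio are assumptions, not something memcached defines. With the numbers above the ratio comes out at roughly 86%, and zero evictions suggests the cache is not short of memory.

```c
#include <assert.h>
#include <stdint.h>

/* Fraction of get requests answered from the cache, between 0.0 and 1.0.
 * Returns 0.0 when no gets have been counted yet. */
double hit_ratio(uint64_t get_hits, uint64_t get_misses)
{
    uint64_t total = get_hits + get_misses;
    if (total == 0) {
        return 0.0;
    }
    return (double)get_hits / (double)total;
}

/* Hypothetical rule of thumb: the ratio is at least min_ratio and
 * nothing has been evicted to make room for new items. */
int cache_looks_healthy(uint64_t get_hits, uint64_t get_misses,
                        uint64_t evictions, double min_ratio)
{
    assert(min_ratio >= 0.0 && min_ratio <= 1.0);
    return hit_ratio(get_hits, get_misses) >= min_ratio && evictions == 0;
}
```

A low ratio usually points at keys that are too specific or TTLs that are too short; a growing evictions counter points at too little memory for the working set.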
22. VARNISH IS:
•
A caching HTTP reverse proxy
•
Really, really, really FAST
•
Usually limited by the speed of the network
•
Decently flexible, thanks to the VCL configuration language
24. NICE SPEED
Now let's see how to use Varnish effectively on my very dynamic site
25. COMMON PROBLEMS TO OVERCOME
•
My pages are a mix of highly dynamic sections and mostly static stuff, and Varnish supposedly only caches
whole pages
•
I need to control/flush/refresh the cache without stopping/starting/killing/rebooting/pulling the cord/
assaulting the datacenter, and I prefer to do it from within my app
•
My visitors have unique stuff
•
Sessions
•
Cookies
•
Statistics and tracking visitors
26. ABOUT ESI
•
Edge Side Includes or ESI is a small markup language for edge level
dynamic web content assembly. The purpose of ESI is to tackle the
problem of web infrastructure scaling.
<HTML>
<BODY>
…
<esi:include src="/esi/private/recentproducts"/>
…
</BODY>
</HTML>
27. [Figure: parts of a page labeled by how often they change: "Doesn't change at all" (twice), "session specific" (twice), "1 minute", "2-4 minutes", "1 hour", "24h"]
29. HOW DOES IT WORK?
[Figure: Varnish request flow diagram. Nodes: Client request, vcl_recv, vcl_pipe, vcl_pass, vcl_hash, vcl_hit, vcl_miss, vcl_fetch, vcl_deliver, vcl_error, Backend1, Backend2. Edge labels: pipe, pass, lookup, fetch.]
30. vcl_recv
•
First checkpoint when a request arrives and is parsed
•
We must decide whether to lookup, pass or pipe the request
•
We can choose a backend to use
•
We have the req object
•
PURGE-, BAN- or REFRESH-like requests are defined here
•
We can set a header in the req object to tell our backend the request is from Varnish
31.
# Body of vcl_recv (Varnish 3 syntax)
set req.backend = default;
set req.http.X-Varnish-Handshake = "1";
set req.http.X-Forwarded-For = client.ip;

if (req.url ~ "/esi/") {
    # Pass the path segment after /esi/ (e.g. "private") on to the backend
    set req.http.X-Varnish-Esi = regsub(req.url, ".esi/(\w+)/.*", "\1");
    remove req.http.Accept-Encoding;
}
if (req.request != "GET" && req.request != "HEAD") {
    # We only deal with GET and HEAD by default
    return (pass);
}
if (req.http.Cookie !~ "PHPSESSID=") {
    # No session cookie yet: generate one inside Varnish
    call generate_session;
}
return (lookup);
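32. WAIT, WHAT?
sub generate_session {
    # Inline C: generate a PHPSESSID value and store it in a request header
    C{
        char uuid_buf[50];
        generate_uuid(uuid_buf);
        /* The header name is prefixed with its length in octal,
           colon included: 23 characters, so \027 */
        VRT_SetHdr(sp, HDR_REQ,
            "\027X-Varnish-Fake-Session:",
            uuid_buf,
            vrt_magic_string_end
        );
    }C

    # Prepend the generated session to any cookies the client already sent
    if (req.http.Cookie) {
        set req.http.Cookie = req.http.X-Varnish-Fake-Session + "; " + req.http.Cookie;
    } else {
        set req.http.Cookie = req.http.X-Varnish-Fake-Session;
    }
}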
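34.
C{
    #include <stdlib.h>
    #include <stdio.h>
    #include <time.h>
    #include <pthread.h>

    /* lrand48() uses shared global state, so serialize access across threads */
    static pthread_mutex_t lrand_mutex = PTHREAD_MUTEX_INITIALIZER;

    /* Writes "PHPSESSID=" plus 32 hex digits into buf (at least 43 bytes).
       Note: lrand48() is not cryptographically secure and is never seeded
       here, so the generated IDs are predictable. */
    void generate_uuid(char* buf) {
        pthread_mutex_lock(&lrand_mutex);
        long a = lrand48();
        long b = lrand48();
        long c = lrand48();
        long d = lrand48();
        pthread_mutex_unlock(&lrand_mutex);
        sprintf(buf, "PHPSESSID=%08lx%04lx%04lx%04lx%04lx%08lx",
            a,
            b & 0xffff,
            ((b & 0x0fff0000L) >> 16) | 0x4000, /* UUID version 4 marker */
            (c & 0x0fff) | 0x8000,              /* UUID variant marker */
            (c & 0xffff0000L) >> 16,
            d
        );
        return;
    }
}C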
36. vcl_hash
•
Generates the hash through which Varnish looks up an object
•
We have the req object
•
We can make certain objects unique in the cache based on
something more than just the URL, like a session cookie.
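The effect of adding the session to the hash can be modelled in a few lines of C. This is only an illustration of the idea: Varnish computes its own digest internally, and the FNV-1a hash, the function name cache_key and its parameters are stand-ins chosen here for brevity. Passing NULL as the session gives one shared object per URL and host; passing a session ID gives each visitor a private copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV_OFFSET 14695981039346656037ULL
#define FNV_PRIME  1099511628211ULL

/* Mix one string into the running hash, followed by a separator byte
 * so that ("ab", "c") and ("a", "bc") do not feed identical bytes. */
static uint64_t feed(uint64_t hash, const char *text)
{
    for (; *text != '\0'; text++) {
        hash ^= (uint64_t)(unsigned char)*text;
        hash *= FNV_PRIME;
    }
    hash ^= 0xffu;
    hash *= FNV_PRIME;
    return hash;
}

/* Lookup key for a cached object. session may be NULL for content
 * that is the same for every visitor. */
uint64_t cache_key(const char *url, const char *host, const char *session)
{
    assert(url != NULL && host != NULL);
    uint64_t hash = FNV_OFFSET;
    hash = feed(hash, url);
    hash = feed(hash, host);
    if (session != NULL) {
        hash = feed(hash, session);
    }
    return hash;
}
```

The trade-off is cache size: every session-specific fragment is stored once per visitor, so it pays to hash on the session only for the fragments that really differ between visitors.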
39. vcl_fetch
•
Takes control when a response from the backend is fetched and parsed
•
We have the req and beresp objects
•
A good place to sanitise the backend response and control TTL
•
Removal of Set-Cookie header is a good practice here
•
Add helper headers to the cached object for the ban lurker
•
We can choose to deliver or hit_for_pass here
40. beresp.ttl
Before Varnish runs vcl_fetch, the beresp.ttl variable has already been set to a value. Varnish
uses the first value it finds among:
•
The s-maxage directive in the Cache-Control response header
•
The max-age directive in the Cache-Control response header
•
The Expires response header
•
The default_ttl parameter
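The precedence list above amounts to a simple first-match rule, sketched here in C. The struct, its field names and the convention that a negative number means "not present" are assumptions made for this example; it models the order of the checks only, not how Varnish parses headers.

```c
#include <assert.h>
#include <stddef.h>

/* Lifetimes in seconds taken from the backend response.
 * A negative value means the source was not present. */
struct ttl_sources {
    long s_maxage;   /* Cache-Control: s-maxage */
    long max_age;    /* Cache-Control: max-age */
    long expires_in; /* seconds from now until the Expires time */
};

/* Pick the initial TTL: the first source that is present wins,
 * otherwise fall back to the default_ttl parameter. */
long initial_ttl(const struct ttl_sources *sources, long default_ttl)
{
    assert(sources != NULL);
    if (sources->s_maxage >= 0) {
        return sources->s_maxage;
    }
    if (sources->max_age >= 0) {
        return sources->max_age;
    }
    if (sources->expires_in >= 0) {
        return sources->expires_in;
    }
    return default_ttl;
}
```

Whatever this produces is only the starting point: vcl_fetch can still overwrite beresp.ttl afterwards.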
41.
# Body of vcl_fetch (Varnish 3 syntax)
# Helper headers stored with the cached object
set beresp.http.X-Url = req.url;
set beresp.http.X-Host = req.http.host;
set beresp.http.X-Varnish-Session = regsub(req.http.Cookie, "^.*?PHPSESSID=([^;]*);*.*$", "\1");
if (beresp.status != 200 && beresp.status != 404) {
    # Don't cache other statuses; remember that decision for 15 seconds
    set beresp.ttl = 15s;
    return (hit_for_pass);
}
if (beresp.http.Set-Cookie) {
    remove beresp.http.Set-Cookie;
}
if (beresp.http.X-Varnish-Esi == "1") {
    set beresp.do_esi = true;
}
if (req.url ~ "\.(jpg|jpeg|gif|otf|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|scripts)$") {
    # Static files can stay cached for three hours
    set beresp.ttl = 180m;
}
return (deliver);
43. vcl_deliver
•
Takes control just before a response is sent to the client
•
We have the req and resp objects
•
Executes after hit, miss and fetch, hit_for_pass or pass (but not pipe)
•
Removal of all headers we set during the VCL flow is a good idea here
•
We can also add headers here that should go to the client, but shouldn’t be in
the cache
44.
# Body of vcl_deliver (Varnish 3 syntax)
if (req.http.X-Varnish-Fake-Session) {
    # Hand the session generated in vcl_recv to the client as a cookie
    call generate_session_expires;
    set resp.http.Set-Cookie
        = req.http.X-Varnish-Fake-Session + "; expires="
        + resp.http.X-Varnish-Cookie-Expires + "; path=/";
    if (req.http.Host) {
        # Strip any port number from the Host header
        set resp.http.Set-Cookie = resp.http.Set-Cookie + "; domain=" + regsub(req.http.Host, ":\d+$", "");
    }
    set resp.http.Set-Cookie = resp.http.Set-Cookie + "; httponly";
    unset resp.http.X-Varnish-Cookie-Expires;
}
if (!client.ip ~ debug) {
    # Hide the helper headers from everyone outside the debug ACL
    unset resp.http.X-Host;
    unset resp.http.X-Url;
    unset resp.http.X-Varnish-Session;
} else {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}

return (deliver);
47. INVALIDATING CACHED OBJECTS
•
We can control cached objects through http requests to varnish with
some clever VCL-ing
•
PURGE - we can purge a single object from the cache
•
BAN - we can ban a selection of matching objects from the cache
•
REFRESH - we can fetch a new copy of an object while the old one is
still served in the meantime
48.
sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        error 404 "Not in cache";
    }
}
53. COMMON PROBLEMS TO OVERCOME
•
My pages are a mix of highly dynamic sections and mostly static stuff, and Varnish supposedly only caches whole
pages => Use ESI
•
I need to control/flush/refresh the cache without stopping/starting/killing/rebooting/pulling the cord/assaulting
the datacenter, and I prefer to do it from within my app => Set up PURGE/BAN/REFRESH in the VCL
•
My visitors have unique stuff => Use the session cookie in vcl_hash to keep a unique copy
•
Sessions => Use the generate-session-in-Varnish trick
•
Cookies => Uhhh, don't use 'em?
•
Statistics and tracking visitors => Use the memcached VMOD and process stuff asynchronously on the backend