3. E3.
E3 = Electronic Entertainment Expo
- Biggest annual video game conference up through 2006
- Every May
- 60,000 - 80,000 industry people
- Publishers spend millions
- Staples Center in LA
- Blogs, magazines, crappy cable TV shows, websites
- The two biggest gaming websites are...
4. IGN, owned by News Corp, who also owns MySpace
Super good at SEO and breaking news
The other one is...
5. GameSpot, owned by CNET
GS gets 1.5m uniques a day, over 15m pageviews
Here’s the obligatory alexa graph...
6. The spikes are E3
Twitter’s catching up!
Okay, so I used to work for GameSpot....
7. funnier if you can see how long my hair is now. and, how not nerdy i am.
8. I was a PHP developer for GameSpot
- Hubs
- Tagging system
- User videos
- User profiles
Last year I convinced my boss to send me to E3...
9. I got sent down because of Rails, but that’s another story
- Booth between Nintendo and Sony.
- Microsoft was in another hall.
- Unveiling of Wii & PS3
Our booth had a studio attached to it ($400k on new eq) and a bunch of computers inside it...
10. - I sat at one of these computers
- The editors would run around writing stories and interviewing people and playing games
- GS had exclusive rights to internet streaming of the Sony and Nintendo conferences
- Nintendo.com and Sony.com were pointing at Gamespot for press conference streaming
- Also streamed Microsoft (they streamed on Xbox Live too)
- Live blogging (twittering) from the press conferences for kids at school
- Imagine millions of teenage boys constantly hitting refresh...
11. - We used Netscalers
-- switch, firewall, and “accelerator”
- Normally our Netscalers would gzip outgoing responses
- Had to turn off gzip compression because Netscalers’ CPUs were running so hot
- Couldn’t gzip and serve requests fast enough
- 3000 req/sec
- ~70 app servers
- ~15 database servers
- apache2, php4 w/APC
- when the smoke cleared...
14. Memcaching Rails
CHRIS WANSTRATH
ERR FREE
[ http://errfree.com ]
memcaching rails
i will talk about:
- what memcached is
- when and where to use it
- tricks, code
- libraries, tools, and hopefully answer questions
please ask questions whenever
i promise i cant answer them all.
also: i don’t want you guys to think that i think i’m an expert
i’m more of a foot soldier, or a ninja
i’m officially renaming this session...
15. Memcaching Rails
CHRIS WANSTRATH
ERR FREE
[ http://errfree.com ]
16. chris wanstrath
railsconf 2007
kickin’ ass
with cache-fu
kickin ass with cache-fu
what is memcached...
18. class Memcache < Hash
undef :each, :keys
end
a hash that...
you can’t enumerate over
you can’t find all the keys for
you can still GET, SET, and DELETE keys...
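To make the "hash you can't enumerate" idea concrete, here is a minimal in-process sketch. `TinyCache` is a made-up name, not part of memcached or any client library; it only mirrors the three operations named above and keeps everything in one Ruby process.

```ruby
# A hypothetical stand-in for a memcached client: a hash that only
# exposes get, set and delete, with no way to list or walk its keys.
class TinyCache
  def initialize
    @store = {}
  end

  # Returns the stored value, or nil on a miss.
  def get(key)
    @store[key]
  end

  # Stores the value and returns it.
  def set(key, value)
    @store[key] = value
  end

  # Removes the key; returns true if the key was present.
  def delete(key)
    existed = @store.key?(key)
    @store.delete(key)
    existed
  end
end

cache = TinyCache.new
cache.set('app-test:Story:2', 'a story')
cache.get('app-test:Story:2')   # the stored value
cache.get('app-test:Story:1')   # nil, a miss
cache.respond_to?(:keys)        # false, nothing to enumerate
```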
19. class Memcache < DRbHash
undef :each, :keys
end
distributed
by distributed, we mean you start a daemon on each app server
keys are stored on different servers
transparently, quickly
let’s look at a daemon...
20. $ memcached -vv
3 server listening
7 new client connection
7 get app-test:Story:1
7 END
7 set app-test:Story:2 0
7 STORED
7 delete app-test:Story:1
7 DELETED
here is some sample debugging output
single memcache daemon, getting, setting, deleting
truncated but you get the idea
21. $ memcached -vv
3 server listening
7 new client connection
7 get app-test:Story:1
7 END
7 set app-test:Story:2 0
7 STORED
7 delete app-test:Story:1
7 DELETED
developed by...
22. livejournal
alleviate database stress
they were growing too fast, couldnt scale their databases
you can distribute reads but everyone needs to write, which can block
hdd too slow -- avoid sql queries / disk access by caching in RAM
cache anything: generated images, intense number crunching, html, whatever
fast, C, non-blocking IO, O(1) lookups
scales -- drop in a new daemon and youre good to go
it’s also used by...
24. you’ve got a rails site
should you use memcached?
25. YAGNI
ya ain’t gonna need it
none of the big guys built memcache into their infrastructure
(just ask twitter)
build it in later
focus on your app first
in small apps it is slower than sql
hardware is the real special sauce
memcached wont help if you cant keep up with the IO requests
you probably don’t need memcache...
27. UYRDNI
(unless you really do need it)
this would be:
- millions of hits
- millions of rows
- millions of both
if you’re getting these, you can use memcached to help with the heavy lifting...
28. before we get ahead of ourselves, we need to look at the basic pattern...
29. class Presentation < ActiveRecord::Base
def self.get_cache(id)
if data = @cache.get(id)
data
else
data = find(id)
@cache.set(id, data)
data
end
end
end
pretend @cache is our memcache object
try to find an id in memcache
if it’s nil, we find it in the database
we set it to the cache
we return
second time we call this method, it returns the cached data
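A runnable sketch of the same pattern, assuming nothing but plain Ruby. `FakeCache`, the `DB` hash and the `db_hits` counter are stand-ins invented here so the flow can be tried without Rails or a memcached daemon.

```ruby
# Hypothetical in-memory stand-ins so the pattern runs without Rails
# or a memcached daemon.
class FakeCache
  def initialize; @store = {}; end
  def get(key); @store[key]; end
  def set(key, value); @store[key] = value; end
end

class Presentation
  DB = { 1 => 'Kickin Ass with cache-fu' }  # pretend database table

  class << self
    attr_accessor :cache, :db_hits

    # Stand-in for ActiveRecord's find; counts every "database" lookup.
    def find(id)
      self.db_hits += 1
      DB.fetch(id)
    end

    def get_cache(id)
      if data = cache.get(id)
        data
      else
        data = find(id)
        cache.set(id, data)
        data
      end
    end
  end
end

Presentation.cache = FakeCache.new
Presentation.db_hits = 0
Presentation.get_cache(1)  # miss: reads the database, fills the cache
Presentation.get_cache(1)  # hit: served from the cache
puts Presentation.db_hits  # prints 1
```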
a simpler way to write this...
30. class Presentation < ActiveRecord::Base
def self.get_cache(id)
@cache.get(id) ||
@cache.set(id, find(id))
end
end
almost could rewrite it like this
but #get can return false
so, with this basic pattern, we can cache...
31. Fragments
Actions
Sessions
Objects
fragments
- tag cloud
- user info / hcard
actions
- site index
- seldom-changing content
- front door
- everything but the layout
sessions
- if you’re not using the cookie store on edge
objects
- user object
- article object
- avoid sql
what tools can you use...
32. memcache-client
by eric hodel
it’s the good ruby memcache api
bare metal
- configure servers
- instantiate memcache object
- get basic daemon stats
- set / get / delete
- uses marshal to store data
- speeds up object creation on cache hit because unmarshal faster than #new
cake to install...
33. memcache-client
$ gem install memcache-client
gem install
of course, you need memcached running...
35. CachedModel
- from eric hodel
- used for activerecord object caching
- overwrites find()
- caches single objects
- no complex queries
- clears cache on update
- simple and clean
36. Fragment Cache Store
one on rubyforge
patches rails to work with memcache-client
lets you set a time based expiry on fragments
useful because all your mongrels share the same cache
no caching on disk
42. cache_fu
( acts_as_cached 2.0 )
acts as cached 2
rails plugin
used in production on chowhound and chow
other sites too
out of the box, this plugin can handle caching...
43. Fragments
Actions
Sessions
Objects
the things we talked about earlier
- can automatically setup memcache sessions
- can make all fragment caching use memcache
- can do the same for action caching
- can give any ruby object (activerecord) get_cache set_cache and expire_cache
i’m not going to go in depth into the basics
if you want that information...
44. acts_as_cached
i have another (outdated) pdf on my blog with some basic api info on it
also a google group you can join
so first, sessions and fragments...
45. config/memcached.yml
defaults:
ttl: 1800
namespace: railsconf
sessions: false
fragments: false
servers: localhost:11211
here’s a snippet of the yaml config file that cache_fu uses
each environment has its own section, like database.yml, which inherits from this default
ttl is the default time to live, or expiry
namespace is the namespace all the keys live under -- lets you have different apps sharing
the same servers
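One way to picture the "each environment inherits from defaults" behaviour is a plain hash merge. This is only a sketch: the `production` section and the `memcached_config` helper are made up here, and the merge rule is an assumption about how the inheritance works, not cache_fu's actual loader.

```ruby
require 'yaml'

# Hypothetical config text; the defaults mirror the slide, the
# production section is made up for illustration.
CONFIG_TEXT = <<~CONFIG
  defaults:
    ttl: 1800
    namespace: railsconf
    sessions: false
    fragments: false
    servers: localhost:11211
  production:
    sessions: true
    fragments: true
CONFIG

# Assumed inheritance rule: start from defaults, then let the
# environment's own keys win.
def memcached_config(text, environment)
  config = YAML.load(text)
  config.fetch('defaults', {}).merge(config.fetch(environment, {}))
end

production = memcached_config(CONFIG_TEXT, 'production')
production['sessions']  # true, overridden by the environment
production['ttl']       # 1800, inherited from defaults
```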
so to turn on memcache as our fragment and session store...
46. config/memcached.yml
defaults:
ttl: 1800
namespace: railsconf
sessions: true
fragments: true
servers: localhost:11211
bam
more realistically you’d have something like...
47. config/memcached.yml
production:
benchmarking: false
sessions: true
fragments: true
servers:
- 192.185.254.121:11211
- 192.185.254.138:11211
- 192.185.254.160:11211
might want to stick with something simple in dev mode
while we’re here, our first tip, learned the hard way...
48. config/memcached.yml
production:
benchmarking: false
sessions: true
fragments: true
servers:
- 192.185.254.121:11211
- 192.185.254.138:11211
- 192.185.254.160:11211
use ip addresses for servers
dns requests can make a noticeable difference on app performance
what if your internal dns goes down? takes whole site with you
we didnt manage our own dns at cnet
it’s happened to me more than i’d care to admit
so, that’s fragments and sessions. done
what about models...
49. class Presentation < ActiveRecord::Base
acts_as_cached
end
our class from before
it’s acting like caching
this adds a bunch of methods, the main two being...
50. get_cache
expire_cache
get_cache will by default go to #find if it’s a miss
it can accept a block which it will use instead on a miss
because of this you probably never need set_cache
expire_cache issues a DELETE to memcached
cachedmodel automatically clears a record’s cache on update or delete
with cache_fu, we ask you to do that explicitly
but it’s easy...
51. class Presentation < ActiveRecord::Base
acts_as_cached
after_save :expire_cache
end
expire_cache is also an instance method which uses the object’s id
now if we save a presentation object, its cache will be cleared for us
we did almost all of the caching on chowhound this way
strong cache integrity -- rarely stale data
why not after_update?
you want to clear an object’s cache after create because you may have cached that the object
doesnt exist
like, caching nil for an id that isnt yet created
so we want to do it after save
but this brings us to another gotcha...
52. class Presentation < ActiveRecord::Base
def self.get_cache(id)
if data = @cache.get(id)
data
else
data = find(id)
@cache.set(id, data)
data
end
end
end
our get_cache method from before
what happens if we want to cache nil?
how do we express the non-existence of a record?
well, cache_fu checks specifically for nil...
53. class Presentation < ActiveRecord::Base
def self.get_cache(id)
if not (data = @cache.get(id)).nil?
data
else
data = find(id)
@cache.set(id, data)
data
end
end
end
so only things that are not nil are returned
what good is this? it lets us cache false
so when you try to cache nil, cache_fu caches false...
54. class Presentation < ActiveRecord::Base
def self.get_cache(id)
if not (data = @cache.get(id)).nil?
data
else
data = find(id) || false
@cache.set(id, data)
data
end
end
end
you dont want to keep running a query on a 4 million row table when you know that the record
you’re looking for isnt there
it’s really not any different than the record being there
you want to cache information so you dont need to keep expensively looking it up...
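Here is a small runnable sketch of caching "this record does not exist". `FakeCache`, the `DB` hash and `db_hits` are invented stand-ins, and `find` is assumed to return nil for a missing row so the `|| false` has something to work with.

```ruby
# Hypothetical stand-ins: a hash-backed cache and a fake table.
class FakeCache
  def initialize; @store = {}; end
  def get(key); @store[key]; end
  def set(key, value); @store[key] = value; end
end

class Presentation
  DB = { 1 => 'cache-fu' }

  class << self
    attr_accessor :cache, :db_hits

    # Returns nil for a missing row, and counts every lookup.
    def find(id)
      self.db_hits += 1
      DB[id]
    end

    def get_cache(id)
      if !(data = cache.get(id)).nil?
        data
      else
        data = find(id) || false  # store false to mean "no such record"
        cache.set(id, data)
        data
      end
    end
  end
end

Presentation.cache = FakeCache.new
Presentation.db_hits = 0
3.times { Presentation.get_cache(99) }  # id 99 does not exist
puts Presentation.db_hits               # prints 1
```

With the plain `if data = @cache.get(id)` check, the cached false would count as a miss and all three calls would hit the table.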
56. fat models
skinny controllers
we want to write custom finders instead of cluttering our controller
we can implement many of these custom finders, in our model...
57. with with_scope
in this blog post the example used is find_playing
but the inspiration for this post was some chow code which used find_live that me and evan weaver
did
only show published, public items
not deleted or banned items
so what if you want to always use this scoped find call when caching?
you have two options...
58. class Presentation < ActiveRecord::Base
acts_as_cached :conditions => 'published = 1'
end
the acts_as_cached call can take any arbitrary parameter
will pass it through to the find call it uses
so now all our get_cache calls are scoped -- doesnt overwrite the model’s find method
goes against the with_scope custom finder idea...
59. class Presentation < ActiveRecord::Base
acts_as_cached :finder => :find_live
end
and now we’re scoped
this isnt a good controller example, though
with_scope is not the majority case
often you just get crazy finds in controllers you want to cache...
60. Topic.find :all,
:conditions =>
["created_at > ?", 1.week.ago],
:order => 'post_count desc',
:limit => 5
do we want this in our controller?
does our caching code go in our controller?
doesnt seem very skinny...
61. seems fatty
what we’d typically do here is write a custom finder
pre-rolled find method...
62. class Topic < ActiveRecord::Base
def self.weekly_popular(limit = 5)
find :all,
:conditions =>
["created_at > ?", 1.week.ago],
:order => 'post_count desc',
:limit => limit
end
end
write our finder
wraps our custom finder
(you wrote a test, right?)
gives us a skinnier controller...
64. DB: 0.00 (0%)
on chowhound, this is what we aimed for
every second page load should not hit mysql at all
big time cache coverage on the backend
so we’d cache every custom finder or find call
remember that get_cache takes a block
we can write a custom cached finder which wraps our finder method...
65. class Topic < ActiveRecord::Base
def self.cached_weekly_popular
get_cache(:weekly_popular) do
weekly_popular
end
end
end
so if it’s cached, we just return it
if not, we run the query and cache it...
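The block form can be sketched in plain Ruby. `BlockCache` below is a hypothetical module, not cache_fu's real code: it only shows the idea that the block runs on a miss and its result is stored under the key. The `queries` counter and the canned result are made up for the demo.

```ruby
# A hypothetical module, not cache_fu's real code: get_cache yields to
# the block on a miss and stores whatever the block returns.
module BlockCache
  def cache_store
    @cache_store ||= {}
  end

  def get_cache(key)
    return cache_store[key] if cache_store.key?(key)
    cache_store[key] = yield
  end
end

class Topic
  extend BlockCache

  class << self
    attr_accessor :queries
  end
  self.queries = 0

  # Stand-in for the expensive find call.
  def self.weekly_popular
    self.queries += 1
    ['rails', 'memcached', 'ruby']
  end

  def self.cached_weekly_popular
    get_cache(:weekly_popular) { weekly_popular }
  end
end

Topic.cached_weekly_popular  # miss: runs the query
Topic.cached_weekly_popular  # hit: the block is not called
puts Topic.queries           # prints 1
```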
67. we dont want to test memcached itself
we disable memcached in tests
your tests shouldnt depend on external resources
to help us out with all this, we’ll use mocha...
68. ruby mocha
mocking and stubbing
you’ve no doubt heard of mocha
one of the best rubygems
it’s in my standard library
i’m also going to write my test bdd-style....
69. bdd test spec
with test/spec
test/spec is a bdd library which wraps test/unit
i like it because it’s clear english
similar to rspec
cache_fu was written bdd with test/spec...
70. A Ruby object acting as cached
- should be able to retrieve a cached version of itself
- should be able to set itself to the cache
- should pass its cached self into a block when supplied
- should be able to expire its cache
- should be able to reset its cache
- should be able to tell if it is cached
- should be able to set itself to the cache with an arbitrary ttl
Finished in 0.028509 seconds.
28 specifications (53 requirements), 0 failures
some of the output!
so our test...
71. context "Calling #cached_weekly_popular" do
specify "should call #weekly_popular if not cached" do
Topic.expects(:fetch_cache).returns(nil)
Topic.cached_weekly_popular.should.equal Topic.weekly_popular
end
specify "should return if cached" do
Topic.expects(:get_cache).returns(true)
Topic.expects(:weekly_popular).never
Topic.cached_weekly_popular
end
end
make sure our cached method is calling the method we want
make sure we did our caching right
fetch_cache is the method we want to force into returning nil
set expectations
that’s it, custom finder
of course, this pattern is built into cache_fu...
73. let’s talk about time
how long will the weekly_popular method be cached?
if it’s more than a day, it could start to become inaccurate and stale
we could set a ttl
sometimes you need different caches on different days
may want story to appear tomorrow but it’s in the db today
how to make sure the cache is cleared when it needs to be?
here’s a simple 80 / 20 solution...
74. def self.cache_key_with_date(id)
date = Date.today.to_s.tr(' ', '_')
cache_key_without_date(id) + ':' + date
end
class << self
alias_method_chain :cache_key, :date
end
override the cache_key method
add the date
at midnight, forces a cache miss
insta-publishing
works by day, not by time
this is class wide
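The midnight rollover is easy to see in isolation. A minimal sketch, assuming a plain Hash as the cache; `dated_cache_key` is a hypothetical helper that takes the date as a parameter so the rollover can be shown without waiting for midnight.

```ruby
require 'date'

# Hypothetical helper: builds a key that changes when the date changes.
def dated_cache_key(name, date = Date.today)
  "#{name}:#{date}"
end

cache = {}
may_17 = Date.new(2007, 5, 17)
may_18 = Date.new(2007, 5, 18)

cache[dated_cache_key('weekly_popular', may_17)] = ['old list']

cache[dated_cache_key('weekly_popular', may_17)]  # hit during May 17
cache[dated_cache_key('weekly_popular', may_18)]  # nil: a miss after midnight
```

Yesterday's entry is never deleted by this scheme; it just stops being asked for and is left to expire through its ttl.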
if you just want to do it for one key...
75. class Topic < ActiveRecord::Base
def self.date_for_key
Date.today.to_s.tr(' ', '_')
end
def self.cached_weekly_popular
key = 'weekly_popular' + date_for_key
get_cache(key) { weekly_popular }
end
end
add a method, or something
remember: this is a collection, so be careful...
76. memcached only stores 1 meg of data per key
‘slabs’
if youre caching associations along with objects, slabs can get big fast
that said, a 200+ post thread on chowhound only takes up about 300k
and those people talk a lot
so check and think before caching anything
let’s say you have a 200 post thread...
77. this one has 233
we’ve got all these users. some users appear more than once.
we dont want to cache users with forum posts, that can give us stale data
would have to clear the cache of every post christine has made whenever she changes her avatar
how do we avoid 233+ memcache calls?
79. Topic.get_cache(1, 2, 3)
so does get_cache
this will utilize memcache-client’s get_multi
grabs all the keys in parallel
cache_fu fills in and caches the blanks for you
when i last looked at the livejournal code, they used get_multi like crazy
avoid hitting the cache -- it can add up to be expensive
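A rough sketch of the "one multi-key fetch, then fill in the blanks" idea. `CountingCache` and `get_cache_multi` are invented for this example; the fake `get_multi` imitates the behaviour described above (a hash of the keys that were found) and counts read round trips so the saving is visible.

```ruby
# Hypothetical cache that counts read round trips, to show why one
# multi-key fetch beats many single gets.
class CountingCache
  attr_reader :reads

  def initialize
    @store = {}
    @reads = 0
  end

  def set(key, value)
    @store[key] = value
  end

  # One round trip; returns only the keys that were found.
  def get_multi(*keys)
    @reads += 1
    keys.each_with_object({}) do |key, found|
      found[key] = @store[key] if @store.key?(key)
    end
  end
end

# Fetch many ids in one cache call, then load and cache the misses.
# The block stands in for a database find on the missing ids.
def get_cache_multi(cache, ids)
  found = cache.get_multi(*ids)
  missing = ids - found.keys
  unless missing.empty?
    yield(missing).each do |id, record|
      found[id] = cache.set(id, record)
    end
  end
  found
end

cache = CountingCache.new
cache.set(1, 'topic one')
topics = get_cache_multi(cache, [1, 2, 3]) do |missing|
  missing.map { |id| [id, "topic #{id} from db"] }
end
topics.keys.sort  # [1, 2, 3]
cache.reads       # 1
```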
so, in other words...
80. user_ids = @topic.posts.map(&:user_id).uniq
@users = User.get_cache(user_ids)
in our controller
will give us a hash keyed by the user ids
can reference this in our view
keep things speedy, minimize memcached calls
we only use this in one place on chowhound
another thing we can do to keep our cache speedy is to use a process cache...
81. class ApplicationController
before_filter :local_cache_for_request
end
built into cache_fu
keeps a local hash of memcache’d objects to speed up subsequent access on a single page view
also in cachedmodel
for example...
82. # pulls from memcache
@user = User.get_cache(1)
# pulls from local cache
@user = User.get_cache(1)
in a controller
cleared out at the start of every new request
doesnt carry over
responds to expires and sets, so you wont get stale data within a single request
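The local cache amounts to a per-request hash sitting in front of the shared cache. A minimal sketch with made-up names (`LocalFirstCache`, `remote_gets`, `reset_local!`); the shared cache is played by a plain Hash, and none of this is cache_fu's actual implementation.

```ruby
# Hypothetical two-level lookup: a per-request hash in front of a
# slower shared cache. Names are illustrative, not cache_fu's API.
class LocalFirstCache
  attr_reader :remote_gets

  def initialize(remote)
    @remote = remote  # any object with [], []= and delete; a Hash here
    @local = {}
    @remote_gets = 0
  end

  def get(key)
    return @local[key] if @local.key?(key)
    @remote_gets += 1
    value = @remote[key]
    @local[key] = value unless value.nil?
    value
  end

  # Writes go to both levels so the request never sees stale data.
  def set(key, value)
    @remote[key] = value
    @local[key] = value
  end

  def expire(key)
    @remote.delete(key)
    @local.delete(key)
  end

  # Call at the start of every request.
  def reset_local!
    @local.clear
  end
end

remote = { 'User:1' => 'chris' }
cache = LocalFirstCache.new(remote)
cache.get('User:1')  # pulls from the shared cache
cache.get('User:1')  # pulls from the local hash
cache.remote_gets    # 1
cache.reset_local!   # the next request starts cold
```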
sometimes, though, you dont want any caching in a request
maybe you think you have stale data and want to see your page straight from the database
maybe you want to re-set all keys on a page...
83. class ApplicationController
before_filter :set_cache_override
def set_cache_override
returning true do
ActsAsCached.skip_cache_gets =
!!params[:skip_cache]
end
end
end
skip_cache_gets tells cache_fu to treat every get as a cache miss
everything will be pulled from its source and re-set to the cache
call it with...
85. what if you do this on the front door of a big site like gamespot during peak hours?
all those expensive queries get re-run
but not just for you
if a query takes 1 second to run, and it’s not in the cache
every request within that 1 second will see the query’s cache as a miss
it will be run N times depending on how many requests you get a second
that can literally kill a big website...
86. but forget about the skip_cache
what if you expire a tag cloud on a homepage
a popular home page, and you’ve got a billion tags
the same thing will happen
every request will see the tag cloud’s cache as a miss while the original request builds the data
this can and has taken down gamespot
they have tagging and some crappy programmers
but i dont want to name names...
87.
88. reset_cache
so, we’ve got this guy called reset_cache
grabs data and sets it to the cache without expiring the key
while this is going on, every request gets the old cached data
new cached data is set
crisis averted
you get promoted
you quit php and start doing rails
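The difference between expiring and resetting can be simulated in a single process. This is a sketch, not cache_fu's code: `TagCloudCache` is hypothetical, and the cache read inside each block plays the part of another request arriving while the slow rebuild is still running.

```ruby
# Hypothetical single-process simulation of expire versus reset.
class TagCloudCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  # expire-style: delete first, so readers miss until the rebuild ends.
  def expire_then_rebuild(key)
    @store.delete(key)
    @store[key] = yield
  end

  # reset-style: rebuild first, then overwrite, so readers keep getting
  # the old value until the new one is ready.
  def reset(key)
    @store[key] = yield
  end
end

cache = TagCloudCache.new
cache.reset(:tags) { 'old cloud' }

seen_during_expire = :unset
cache.expire_then_rebuild(:tags) do
  seen_during_expire = cache.get(:tags)  # nil: a concurrent request would miss
  'new cloud'
end

seen_during_reset = :unset
cache.reset(:tags) do
  seen_during_reset = cache.get(:tags)   # 'new cloud': still served from cache
  'newer cloud'
end
```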
another way to do this in cache_fu...
90. class Presentation < ActiveRecord::Base
acts_as_cached
after_save :reset_cache
end
can also rock the reset_cache after_save
while we’re back in the model
you run a migration...
91. you add a field to your model
your cached objects in production wont have this field
if you write views or code which access this field, they’ll break
the unmarshaled objects have no idea this field exists
flush the cache?
kind of, but only for that model...
92. class Presentation < ActiveRecord::Base
acts_as_cached :version => 1
end
cache_fu supports this version parameter
when i set this to 1, all cached presentation objects will be misses
fresh caches for all of them
if i up it to 2, same thing happens again
just a way to keep your keys consistent and unique
we always forget to do it -- when your site 500s you’ll know
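Why bumping the version forces misses: the version becomes part of the key, so old entries are simply never looked up again. A minimal sketch; `versioned_key` and its key layout are assumptions made for illustration, and the exact layout cache_fu uses may differ.

```ruby
# Hypothetical key builder: namespace, class name, version, then id.
def versioned_key(namespace, klass, version, id)
  [namespace, klass, "v#{version}", id].join(':')
end

cache = {}
cache[versioned_key('railsconf', 'Presentation', 1, 42)] = 'object without the new field'

cache[versioned_key('railsconf', 'Presentation', 1, 42)]  # hit on the old layout
cache[versioned_key('railsconf', 'Presentation', 2, 42)]  # nil: the bump forces a miss
```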
speaking of 500s...
94. monit
monit is a great deployment tool for monitoring multiple daemons
across multiple servers
we’ve found memcached to be pretty damn reliable
sometimes we’d see funk after heavy load and we’d just reboot to get things back to normal
single nodes rarely out
which is good, memcache-client cant recover without re-hashing every key
perl api can recover i’ve heard
but recently, there’s this new kid on the blog...
96. ( circle diagram: servers at points 1, 200, 400 and 600 )
you assign each server a number
let’s say these are my four
basically plots the numbers on a circle...
97. ( circle diagram: servers at 1, 200, 400 and 600; request is cache_get :railsconf )
let’s say i want the key ‘railsconf’
98. ( circle diagram: servers at 1, 200, 400 and 600; :railsconf == 100 )
consistently give the key a number
find that number’s position in the conceptual circle
99. ( circle diagram: servers at 1, 200, 400 and 600; :railsconf == 200 )
if that number isnt found, goes to the next highest number which does exist
100. ( circle diagram: servers at 1, 200 and 600; :railsconf == 200 )
so we can remove servers and affect only a subset of keys
101. ( circle diagram: servers at 1, 200, 300, 400, 500, 600 and 700; :railsconf == 200 )
and we can add servers in the same manner
i havent played with it yet but want to soon
in reality i believe each server gets assigned to more than one point on the circle, to help distribute
the cache
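The circle can be sketched in a few lines using the numbers from the slides. `Ring` is a hypothetical class; the points and key positions are hard-coded here, where a real client would derive both from a hash function and, as noted above, probably place each server at several points.

```ruby
# Hypothetical ring using the numbers from the slides.
class Ring
  def initialize(points)
    @points = points.sort
  end

  # The first server point at or past the key's position, wrapping
  # around to the lowest point when none is higher.
  def server_for(position)
    @points.find { |point| point >= position } || @points.first
  end

  def without(point)
    Ring.new(@points - [point])
  end

  def with(point)
    Ring.new(@points + [point])
  end
end

ring = Ring.new([1, 200, 400, 600])
ring.server_for(100)               # 200
ring.without(400).server_for(100)  # 200, unaffected by the removal
ring.without(400).server_for(350)  # 600, only keys that lived on 400 move
ring.server_for(650)               # 1, wraps around the circle
```

Compare that with picking a server by `hash % server_count`, where changing the count remaps most keys at once.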
102. l33t h4x0rs
• Geoffrey Grosenbach
• Rob Sanheim
• Ryan King
• Lourens Naudé
• Michael Moen
• Corey Donohoe
• PJ Hyett
• Eric Hodel
thanks majorly to these guys for their work and contributions to the plugin or ruby memcache in
general
103. {}
( thanks. any questions? )
thanks everyone
any questions?