Memcached has become the de facto standard for caching in web applications. But many users jump in feet first without understanding what it does or, perhaps more importantly, what it does not do. Once you understand memcached, you may come to realize that what it does not do is what makes it so good.
Memcached is a distributed, memory-based caching system. But what does that mean for you? This session will cover the basics of memcached. What are all the components needed? Where is your data cached? What happens when there is a system failure? Is your data stored in more than one place? How do you know what is in your cache? All these questions and more will be answered.
Memcached: What is it and what does it do?
1. MEMCACHED: WHAT IS IT AND WHAT DOES IT DO?
Brian Moon
dealnews.com
http://brian.moonspot.net/
2. @BRIANLMOON
• Senior Web Engineer for
dealnews.com
• Founder and lead developer of
Phorum
• Memcached community member
• Gearmand contributor
• PHP internals contributor
• I used PHP/FI
4. WHAT IS A CACHE?
"1 a: a hiding place especially for
concealing and preserving provisions or
implements b: a secure place of storage"
http://www.merriam-webster.com/dictionary/cache
6. WHAT IS A CACHE?
"...a component that improves performance by
transparently storing data such that future requests for
that data can be served faster. The data that is stored
within a cache might be values that have been
computed earlier or duplicates of original values that
are stored elsewhere. If requested data is contained in
the cache (cache hit), this request can be served by
simply reading the cache, which is comparably faster.
Otherwise (cache miss), the data has to be
recomputed or fetched from its original storage
location, which is comparably slower."
http://en.wikipedia.org/wiki/Cache
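The hit/miss flow in this definition can be sketched in a few lines of Python. This is a minimal sketch, not memcached itself: a plain dict stands in for the cache, and `slow_lookup` is a hypothetical stand-in for a database query or an expensive computation.

```python
cache = {}   # stands in for the cache; a real setup would talk to memcached
calls = []   # records every trip to the slow "original storage"

def slow_lookup(key):
    """Hypothetical expensive operation, e.g. a database query."""
    calls.append(key)
    return key.upper()

def get(key):
    if key in cache:            # cache hit: serve directly from the cache
        return cache[key]
    value = slow_lookup(key)    # cache miss: go back to the original source
    cache[key] = value          # store it so the next request is a hit
    return value

get("a")   # miss: calls slow_lookup
get("a")   # hit: served from the cache, slow_lookup is not called again
```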
8. WHAT IS MEMCACHED?
memcached is a high-performance, distributed
memory object caching system, generic in nature, but
intended for use in speeding up dynamic web
applications by alleviating database load.
• Dumb daemon
• It is a generic key/data storage system
• Uses libevent and epoll/kqueue
• Caches data in memory
• Cache is distributed by the smart clients
12. WHERE IS MY DATA?
• The client (not server) uses a hashing algorithm to
determine the storage server
• Data is sent to only one server
• Servers do not share data
• Data is not replicated
• Two hashing algorithms possible:
• Traditional
• “Consistent”
13. WHERE IS MY DATA?
Traditional
server = servers[hash(key) % servers.length]
(eenie meenie miney moe)
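The one-line formula above translates almost directly into Python. The sketch below assumes CRC32 as the hash function and uses made-up server addresses; real clients may hash differently. It also measures how many keys change servers when the list grows. With a uniform hash, going from 3 to 4 servers leaves a key in place only when `hash % 3 == hash % 4`, which is about one key in four.

```python
import zlib

def pick_server(key, servers):
    """Traditional distribution: hash the key, take it modulo the server count."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]

def moved_fraction(keys, before, after):
    """Fraction of keys that land on a different server after the list changes."""
    moved = sum(1 for k in keys if pick_server(k, before) != pick_server(k, after))
    return moved / len(keys)

# Illustrative addresses only.
servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
keys = ["user:%d" % i for i in range(10000)]

# Adding one server remaps most keys, so most of the cache goes cold at once.
print(moved_fraction(keys, servers, servers + ["10.0.0.4:11211"]))
```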
14. WHERE IS MY DATA?
“Consistent”
Each server is allocated LOTS of numbers on a “wheel”. The key is hashed to a number in that range and the server assigned the closest number is used. Adding/removing servers from the list results in less key reassignment.
http://www.flickr.com/photos/k-bot/2614389196/
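The wheel can be sketched as a sorted list of positions. This is a simplified illustration, not any particular client's algorithm: the `Ring` class, the use of MD5, the 100 points per server, and the server addresses are all assumptions, and "closest" is taken here to mean the next position clockwise.

```python
import bisect
import hashlib

def _point(label):
    """Map a string to a position on the wheel (a 32-bit integer)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:4], "big")

class Ring:
    def __init__(self, servers, points_per_server=100):
        # Give each server many positions on the wheel.
        self._ring = sorted(
            (_point("%s#%d" % (server, i)), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._positions = [position for position, _ in self._ring]

    def pick_server(self, key):
        # First server position at or after the key's position, wrapping around.
        i = bisect.bisect_left(self._positions, _point(key))
        return self._ring[i % len(self._ring)][1]

# Illustrative addresses only.
servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
new_server = "10.0.0.4:11211"
before, after = Ring(servers), Ring(servers + [new_server])

keys = ["user:%d" % i for i in range(5000)]
moved = [k for k in keys if before.pick_server(k) != after.pick_server(k)]
print(len(moved) / len(keys))
```

In this sketch the only keys that move are the ones the new server takes over; every other key stays on the server it already had.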
15. What can I store?
How big can it be?
http://www.flickr.com/photos/hshap/469025786/
16. WHAT CAN I STORE?
• Server stores blobs of binary data
• Most clients will serialize non-string data
• Keys are limited to 250 bytes in length
• Keys cannot contain spaces or “high” characters. Stick
with letters, numbers, _ and you are pretty safe.
• Some clients may normalize keys for you. But, don’t
count on it.
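Since clients may not normalize keys, one option is to do it yourself before every call. The helpers below are hypothetical (the names `is_safe_key` and `safe_key` are not from any client library). They apply the conservative rule from this slide: letters, numbers, and underscore only, at most 250 bytes. Anything else is replaced with a SHA-1 digest, which is an assumption of this sketch rather than a memcached requirement.

```python
import hashlib
import string

MAX_KEY_BYTES = 250  # key length limit from the slide
ALLOWED = set((string.ascii_letters + string.digits + "_").encode("ascii"))

def is_safe_key(key):
    """True if the key is non-empty, short enough, and uses only safe characters."""
    raw = key.encode("utf-8")
    return 0 < len(raw) <= MAX_KEY_BYTES and all(byte in ALLOWED for byte in raw)

def safe_key(key):
    """Return the key unchanged if it is safe, otherwise a hashed replacement."""
    if is_safe_key(key):
        return key
    # The hex digest contains only digits and lowercase letters, so it is safe.
    return "h_" + hashlib.sha1(key.encode("utf-8")).hexdigest()

print(safe_key("user_42"))       # unchanged
print(safe_key("user 42"))       # hashed, because of the space
print(len(safe_key("k" * 300)))  # hashed, because it is too long
```

The trade-off is that hashed keys are no longer readable, so this only makes sense for keys you never need to recognize by eye.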
17. DATA SIZE MATTERS
• Maximum size for one item is 1MB by default (newer versions can raise it with the -I option)
• Some clients support compression
• Data is stored in slabs based on size
• Lots of items of the same size are not optimal
• Slab size can be customized
• May not be able to store items when it appears
there is “free” memory
• Data can be evicted sooner than expected.
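A simplified model helps show why slabs behave this way. In the sketch below every number is an assumption for illustration (a 96-byte starting chunk, a 1.25 growth factor, 8-byte rounding); the real values depend on the memcached version and its startup options. The idea is only that each item goes into the smallest chunk that fits it.

```python
def slab_chunk_sizes(base=96, factor=1.25, max_item=1024 * 1024):
    """Chunk size for each slab class in a simplified model (all values assumed)."""
    sizes = []
    size = base
    while size < max_item:
        sizes.append(size)
        # Grow by the factor and round up to a multiple of 8 bytes.
        size = -(-int(size * factor) // 8) * 8
    sizes.append(max_item)  # the last class holds the largest allowed item
    return sizes

def chunk_for(item_size, sizes):
    """Smallest chunk that fits the item, or None if it is too large to store."""
    for size in sizes:
        if item_size <= size:
            return size
    return None

sizes = slab_chunk_sizes()
print(sizes[:4])                          # the first few chunk sizes
print(chunk_for(100, sizes))              # a 100-byte item uses a larger chunk
print(chunk_for(2 * 1024 * 1024, sizes))  # None: over the item size limit
```

In this model a 100-byte item occupies a 120-byte chunk, and memory already handed to one slab class is not available to another. That is how a store can fail, or force an eviction, while other slab classes still have room.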
18. Evictions
http://www.flickr.com/photos/aussiegall/322980012/
19. EVICTION AND EXPIRATION
• Expiration time can be expressed
as seconds from now or as an
absolute epoch time.
• Items are not removed from
memory when they expire
• Items are evicted when newer
items need to be stored
• Least Recently Used (LRU)
determines what is evicted
• Eviction is done per slab
http://www.flickr.com/photos/bitchcakes/4410181958/
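The bullets above can be modeled with a toy cache. `ToyCache` is a hypothetical class standing in for a single slab class with a fixed number of chunks; it is not how the server is implemented. The 30-day cutoff between "seconds from now" and "absolute epoch time" follows the memcached protocol documentation. The injectable `clock` is only there so the behavior can be demonstrated without waiting.

```python
import time
from collections import OrderedDict

THIRTY_DAYS = 60 * 60 * 24 * 30

class ToyCache:
    """Toy model of one slab class: fixed capacity, LRU eviction, lazy expiration."""

    def __init__(self, capacity, clock=time.time):
        self.capacity = capacity
        self.clock = clock
        self.items = OrderedDict()  # key -> (value, absolute expiry or None)
        self.evictions = 0

    def set(self, key, value, exptime=0):
        if exptime == 0:
            expires = None                      # never expires
        elif exptime <= THIRTY_DAYS:
            expires = self.clock() + exptime    # seconds from now
        else:
            expires = exptime                   # absolute epoch time
        if key not in self.items and len(self.items) >= self.capacity:
            self.items.popitem(last=False)      # evict the least recently used item
            self.evictions += 1
        self.items[key] = (value, expires)
        self.items.move_to_end(key)             # newest item is most recently used

    def get(self, key):
        if key not in self.items:
            return None
        value, expires = self.items[key]
        if expires is not None and expires <= self.clock():
            del self.items[key]                 # expired: only noticed on access
            return None
        self.items.move_to_end(key)             # mark as most recently used
        return value
```

Note that an expired item stays in `items` until something touches it, and that a full cache evicts a perfectly valid item to make room for a new one. Both match the behavior described on this slide.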
20. How do I know it is working?
http://www.flickr.com/photos/carolinadoug/3932117107/
22. HOW WELL IS IT WORKING?
STAT uptime 9207843
STAT cmd_get 66421687
STAT cmd_set 10640419
STAT get_hits 66421687
STAT get_misses 12360549
STAT evictions 0
(84% hit rate)
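The hit rate is hits divided by hits plus misses. The sketch below parses `STAT name value` lines like the ones on this slide and computes it; `parse_stats` and `hit_rate` are hypothetical helper names, and the sample text is copied from the slide rather than fetched from a live server.

```python
def parse_stats(text):
    """Turn 'STAT name value' lines into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "STAT" and parts[2].isdigit():
            stats[parts[1]] = int(parts[2])
    return stats

def hit_rate(stats):
    """Hits as a fraction of all lookups (hits plus misses)."""
    total = stats["get_hits"] + stats["get_misses"]
    return stats["get_hits"] / total if total else 0.0

sample = """STAT get_hits 66421687
STAT get_misses 12360549
STAT evictions 0"""

print(round(hit_rate(parse_stats(sample)) * 100))  # 84
```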
23. HOW WELL IS IT WORKING?
• Graph stats from memcached using Cacti/Ganglia, etc.
• Key stats:
• Hits/Misses
• Gets/Sets
• Evictions
• Cacti Templates:
• http://dealnews.com/developers/
• http://code.google.com/p/mysql-cacti-templates/
24. There are some things
you think you want to
do, but you can’t do
them and/or shouldn’t
do them.
http://www.flickr.com/photos/magdalar/4241254141/
25. HOW DO I SEE THE CACHE?
• You have no way to see the cached data.
• You probably don’t need to see it.
• For memcached to tell you, it would freeze your entire
caching system
• There are debug ways to see.
• DO NOT COMPILE PRODUCTION WITH DEBUG
BECAUSE YOU ARE A CONTROL FREAK!
26. HOW DO I BACK IT UP?
YOU DON’T!!!
• If your application requires that, you are using it wrong
• It is a cache, not a data storage system
• Maybe try Tokyo Tyrant, MongoDB or another
“NOSQL” key/data store
27. NAMESPACES & TAGGING
• There is no concept of namespaces or tagging built in
to memcached
• You can simulate them by storing an extra key
• See the FAQ for an example of simulated namespaces
• This of course means there is no mass delete in
memcached
• There have been patches, but they never performed
well.
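One way to simulate a namespace with an extra key is to keep a version number per namespace and build every real key from it. The sketch below is an illustration of that idea, not the FAQ's exact code. `FakeClient` is a dict-backed stand-in for a memcached client, and `ns_key` and `invalidate_namespace` are hypothetical helper names.

```python
class FakeClient:
    """Dict-backed stand-in for a memcached client (for illustration only)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

def ns_key(client, namespace, key):
    """Build the real key from the namespace's current version number."""
    version_key = "ns_%s" % namespace
    version = client.get(version_key)
    if version is None:  # first use, or the version key was evicted
        version = 1
        client.set(version_key, version)
    return "%s_v%d_%s" % (namespace, version, key)

def invalidate_namespace(client, namespace):
    """'Delete' every key in the namespace by bumping its version."""
    version_key = "ns_%s" % namespace
    client.set(version_key, (client.get(version_key) or 0) + 1)

client = FakeClient()
client.set(ns_key(client, "user", "42"), "alice")
print(client.get(ns_key(client, "user", "42")))  # alice
invalidate_namespace(client, "user")
print(client.get(ns_key(client, "user", "42")))  # None
```

Nothing is actually deleted here. The old keys simply stop being requested and eventually fall out through normal eviction. Against a real server you would likely bump the version with the atomic incr command and seed it with something like a timestamp, so that an evicted version key cannot bring old data back to life.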
28. MORE THINGS NOT TO DO
• Use memcached as a locking daemon
• Use memcached to store data that can’t go away
• Use it to try to speed up your intranet
• Store complex data types that the clients have to
serialize or unserialize
• Complain on the mailing list that you can’t do any of the
things listed above. =)