Memcached has become a critical tool in the web technology stack. High traffic web sites with dynamic content - like Facebook, Twitter, and Wikipedia - rely on Memcached to scale and ensure “snappy” site performance.
This presentation will cover a brief overview of Memcached, then dive into the evolution of Memcached’s use in dynamic web sites and how you can scale your site and get better performance with Memcached. We’ll also review emerging architectures and tools of high-performance, large-scale dynamic websites.
In this webinar you will learn best practices used by some of the hottest sites and get tips on how to avoid potential pitfalls when scaling. Whether you're just building the infrastructure for a brand new site or have a large dynamic site with millions of users, this webinar is for you.
Evolution of a Memcached Deployment Webinar 2010 01 13
1. The Evolution of a Memcached Deployment
Presented by:
Bill Takacs – Director, Product Management
January 2010
2. Agenda
• Rise of the dynamic web
• The web architecture
• Overview of memcached
• The evolution of a dynamic site and Memcached
• Gear6 Solution
2 : Copyright 2010 Gear6 Inc.
3. The Web: What’s Changed?
• Population
• Traffic
• Content & Applications
4. Web Growth: Population
• Forrester: 2.2 billion people online globally by 2013
5. Web Growth: Traffic
Cisco: “Annual global IP traffic will exceed two-thirds of a zettabyte (667 exabytes) in four years (2013)”
-Cisco Visual Networking Index, 9 June 2009
6. Web Growth: Application & Content
• Social Networking
• Entertainment
• Media
• Communication
• Community generated content
Shift: Static to Dynamic
7. Use of online video sharing has doubled since 2006
Pew Internet & American Life Project, Online participation in the social media era by Aaron Smith Dec 10th, 2009
http://www.pewinternet.org/Presentations/2009/RTIP-Social-Media.aspx
8. Use of online social networks has grown 6X since 2005
Pew Internet & American Life Project, Online participation in the social media era by Aaron Smith Dec 10th, 2009
http://www.pewinternet.org/Presentations/2009/RTIP-Social-Media.aspx
9. Web Architecture
➜ Most sites (over 65%) are based on LAMP or Java
➜ Industry standard servers replaced proprietary SMP
➜ Shift to Dynamic Content puts strain on origin sites
[Diagram: Web Stack. Internet, CDN, proxy, and load balancer in front of web servers (Apache, Nginx, Lighttpd), app servers (PHP, Java, Rails, C, Perl, Python), and database servers (MySQL, PostgreSQL). Other labels: Net Interface, Storage Clients, Storage Interface: file, block, FC, SCSI.]
10. Why Does it Matter?
= $
11. What to do?
Cache
CACHE
CACHE!
12. New Caching Architecture for Scaling Out
[Diagram: the same Web Stack (Internet, CDN, proxy, load balancer, web servers, app servers, database servers, storage interface) with a tier of cache servers running memcached added.]
13. Memcached: Pillar of Web 2.0 Architecture
“Everything runs from Memory in Web 2.0”
» Evan Weaver, Twitter, March 2009
14. The Fix: Memcached
“A high performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load”
• Big hash table
• Created by Danga Interactive for LiveJournal
• Significantly reduced database load
• Perfect for web sites with high database load
• In use by Facebook, Twitter, MyYearBook, others
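The idea of alleviating database load can be sketched as a cache-aside read: look in the cache first and query the database only on a miss. This is a minimal illustration, not code from the presentation. `DictCache` is a hypothetical in-memory stand-in for a memcached client, and `get_profile`, `db_lookup`, the `profile:` key prefix, and the 300-second TTL are all assumed names and values.

```python
import time


class DictCache:
    """Hypothetical in-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, expire=0):
        # expire=0 means "no expiry" in this stand-in
        expires_at = time.time() + expire if expire else None
        self._data[key] = (value, expires_at)


def get_profile(cache, db_lookup, user_id, ttl=300):
    """Return a profile from the cache, loading it from the database on a miss."""
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: no database work
    profile = db_lookup(user_id)  # cache miss: run the expensive query once
    cache.set(key, profile, expire=ttl)
    return profile
```

With a real deployment the stand-in would be replaced by an actual memcached client; the read path stays the same.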
15. More on Memcached
• Takes advantage of available DRAM
• Open source
• Distributed under BSD license
• Server - Current version is 1.4.3
» http://www.danga.com/memcached/download.bml
• Many clients
» http://code.google.com/p/memcached/wiki/Clients
16. Memcached: Best Practices
• Use with MySQL :-)
• Use on 64 bit servers
• Cache “expensive operations”
• Cache bi-directionally (R/W)
• Use consistent hashing
• Careful with numbers of connections
• Design to withstand failures gracefully
• Evictions
• Optimize sizing: Instances and pools
• Instrumentation
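Consistent hashing, one of the practices listed here, can be illustrated with a small hash ring. The point is that adding or removing a cache node remaps only the keys that lived on that node, instead of reshuffling almost everything as `hash(key) % n` would. The sketch is an assumption-laden illustration, not the algorithm of any particular client: the `HashRing` class, the SHA-1 hash, the 100 virtual points per node, and the node names are all chosen for the example.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual points (illustrative sketch)."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node, assumed value
        self._ring = []           # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Place several points per node on the ring to even out the load.
        for i in range(self.replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key):
        """Return the node owning the first ring point at or after the key's hash."""
        if not self._ring:
            raise ValueError("ring is empty")
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]
```

A short usage example: build a ring over three hypothetical nodes, then call `ring.get_node("user:42")` to pick the server for that key. After `ring.remove_node(...)`, keys that mapped to the surviving nodes keep their placement.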
17. What Memcached is NOT:
• A persistent data store
• A database
• Application specific
• A large object cache
• Highly available (no HA features)
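Because memcached is neither a persistent store nor highly available on its own, application code usually treats a cache failure as a miss and falls back to the database. The following sketch shows that idea under stated assumptions: `cached_fetch` and `FlakyCache` are hypothetical names, and a real client may raise different exception types than the built-in `ConnectionError` and `TimeoutError` caught here.

```python
class FlakyCache:
    """Hypothetical cache client whose calls always fail, simulating a dead node."""

    def get(self, key):
        raise ConnectionError("cache node unreachable")

    def set(self, key, value, expire=0):
        raise ConnectionError("cache node unreachable")


def cached_fetch(cache, key, load_from_db, ttl=60):
    """Read through the cache, but never let a cache failure break the request."""
    try:
        value = cache.get(key)
    except (ConnectionError, TimeoutError):
        value = None  # treat an unreachable cache as a miss
    if value is not None:
        return value
    value = load_from_db()  # the database remains the source of truth
    try:
        cache.set(key, value, expire=ttl)
    except (ConnectionError, TimeoutError):
        pass  # failing to populate the cache is not fatal
    return value
```

The trade-off is that a dead cache node shifts its share of reads back onto the database, so sizing should leave headroom for that case.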
18. Memcached Use Cases
Site Type                  Repeatable Use
Social Networking          Profile caching
Content Aggregation        HTML/page caching
Ad Targeting               Cookie/profile tracking
Gaming/Entertainment       Session caching
Location-based Services    DB query scaling
Relationship               Session caching
Ecommerce                  Session & HTML caching
19. Evolution of a Dynamic Site #1
A day in the life of a growing web service
[Diagram: many app servers reading from and writing to MySQL servers directly, with no cache tier.]
20. Evolution of a Dynamic Site #2
A day in the life of a growing web service
[Diagram: app servers and memcached instances in front of four MySQL servers; arrows labeled read and write.]
21. Evolution of a Dynamic Site #3
A day in the life of a growing web service
[Diagram: app servers with a growing number of memcached instances in front of four MySQL servers; arrows labeled read and write.]
23. Tinker: Follow Topic Streams Instead of People
[Diagram: "Follow people" versus "Follow events". Example event streams: Inauguration, Mamma Mia!, Political protest, Jazz show, Art show, LOST, Wine tasting, Fashion Week, New Opera, Nascar, Demo conf.]
24. Product Challenges
• Large Data Pipe from multiple sources
• Real Time Analysis and Processing
• Exponential growth
• Intra DB traffic growing exponentially
• Significant replication lag on slaves
25. Tinker Infrastructure – Prototype
Launched March 2009
Application Servers X2
Glam is traditionally an Oracle house, but:
• Leveraged MySQL for Tinker
• Cost
• Features – Replication / Clustering
MySQL:
• 1 Master
• 2 Slaves
• MyISAM
• Replicated
Configuration
• PHP Front End
• MySQL Database
Performance
• Up to 10K users
• Up to 100 queries / second
26. Tinker – A Modest Start
Master-Master Replication
• 2 Master DBs replicated
o 1 for Aggregation
o 1 for Trends
• 3 Slaves for application
• InnoDB to prevent locking
Performance
• Doubled users to 20K+
• 1K events
• No increase in qps
27. CLUSTERING - Reducing Replication
Created Two DB Clusters
Cluster 1 – Main DB
• 1 Master DB for aggregation
• 3 Slaves for application
• 1 Slave for backup
Cluster 2 – Trends DB
• 1 Master
• 1 Slave
• Replicates 3 Tables from master
Performance
• Reduced traffic to slaves by half
• 50K + users
• 3K + events
28. CACHING - Reducing the Number of Selects
Added Memcached to DB driver layer – caching selects
High Availability Memcached
• Dedicated hardware
• Centralized memcached
• 10GB Cache
• 2 Servers – clustered
• Failover, replication, high availability
• Easy to manage and maintain
Performance
• Reduced the load on slaves by 80%
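Caching selects in the DB driver layer can be sketched as a thin wrapper that keys each result set on the SQL text plus its parameters. This is an illustration only, not Tinker's implementation: `CachingQueryLayer`, `run_query`, the `sql:` prefix, and the 60-second TTL are assumed, and `DictCache` is a hypothetical stand-in for a memcached client that ignores expiry for brevity. The key is hashed because memcached keys are limited to 250 bytes and cannot contain whitespace.

```python
import hashlib
import json


class DictCache:
    """Hypothetical stand-in for a memcached client; ignores expiry for brevity."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, expire=0):
        self._data[key] = value


class CachingQueryLayer:
    """Wraps a query function and caches SELECT results (illustrative sketch)."""

    def __init__(self, cache, run_query, ttl=60):
        self.cache = cache
        self.run_query = run_query  # assumed callable: (sql, params) -> rows
        self.ttl = ttl

    @staticmethod
    def cache_key(sql, params):
        # Hash the statement and its parameters into a short, whitespace-free key.
        raw = json.dumps([sql, list(params)], default=str)
        return "sql:" + hashlib.sha1(raw.encode("utf-8")).hexdigest()

    def select(self, sql, params=()):
        key = self.cache_key(sql, params)
        rows = self.cache.get(key)
        if rows is None:  # an empty result list still counts as a cached hit
            rows = self.run_query(sql, params)
            self.cache.set(key, rows, expire=self.ttl)
        return rows
```

Putting the cache in this layer means application code keeps calling `select` unchanged, at the cost of serving results that can be up to one TTL stale.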
29. CACHING – Optimization and Tuning
Smart Caching Strategy
• Based on timeliness of the data
o 1 min to 1 hour
• Invalidate cache based on user activity
o Creating events
o Following events
o Aggregating posts
Improved Queries
• Created appropriate indexes
• Reduced number of queries needed to load a page
• All DB queries on a page must load in <0.5 seconds combined
• Eliminate count queries where appropriate
• The largest table, Feed_Item, was partitioned by date range
o Reduce table / row locks
Performance
• Load average on Slave DBs dropped from >20 to <1
• Page loads that were 20 – 30 sec now loaded in <1 sec
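The caching strategy on this slide combines two mechanisms: a TTL chosen by how timely the data must be, and explicit invalidation when a user action makes an entry stale. A minimal sketch follows. The kinds of data, the TTL values (picked inside the 1 minute to 1 hour range), the key layout, and the names `TTLCache`, `get_cached`, and `on_user_activity` are all assumptions made for illustration.

```python
import time

# Assumed TTLs in seconds, spanning the 1 minute to 1 hour range.
TTL_SECONDS = {"trends": 60, "feed": 300, "event": 3600}


class TTLCache:
    """Hypothetical in-memory stand-in for memcached with per-key expiry."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be exercised in tests
        self._data = {}      # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value, expire):
        self._data[key] = (value, self._clock() + expire)

    def delete(self, key):
        self._data.pop(key, None)


def get_cached(cache, kind, item_id, loader):
    """Fetch an item, caching it for as long as its kind of data stays fresh."""
    key = f"{kind}:{item_id}"
    value = cache.get(key)
    if value is None:
        value = loader(item_id)
        cache.set(key, value, expire=TTL_SECONDS[kind])
    return value


def on_user_activity(cache, event_id):
    """Drop entries made stale by creating, following, or posting to an event."""
    cache.delete(f"feed:{event_id}")
    cache.delete(f"event:{event_id}")
```

Short TTLs bound staleness for fast-moving data, while invalidation lets slow-moving data keep a long TTL without showing users their own changes late.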
30. Generation - Next
Scaling to 10X and 100X of current DB transactions
Larger Memcached Deployment
• Caching strategy
• Large scale Widget deployment
• MySQL based Memcached invalidation strategy
Failover – clusters in multiple colos with replication
DB Sharding
• Balancing load based on event activity
• Event migration from one cluster to another
31. Useful Memcached Tools
advanced reporter
Track hot keys and clients in Memcached
wireshark
Dissect and analyze Memcached network traffic
brutis
Size and test changes to memcache clusters
statsproxy
View buffered Memcached stats in your browser
cacti
Graph and analyze Memcached statistics
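Tools that graph memcached statistics work from the counters returned by the server's `stats` command, which replies with `STAT <name> <value>` lines terminated by `END`. The sketch below parses such a reply and derives a hit ratio from the `get_hits` and `get_misses` counters. The function names `parse_stats` and `hit_ratio` are assumptions for this example, and fetching the reply over the network is left out.

```python
def parse_stats(raw):
    """Parse the text reply of memcached's `stats` command into a dict of strings."""
    stats = {}
    for line in raw.splitlines():
        line = line.strip()
        if line == "END":
            break  # END marks the end of the reply
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_ratio(stats):
    """Fraction of get requests served from the cache, or 0.0 if there were none."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0
```

Tracking the hit ratio alongside the `evictions` counter over time gives an early signal that a pool is undersized.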
32. Statsproxy + Cacti Templates
To use the cacti templates for memcached with statsproxy, you either need to modify the templates to use port 8080 or change the statsproxy config to use port 11211.
33. About Gear6
• First and leading provider of Memcached solutions
• Memcached solution including
o High density
o High Availability
o Advanced memory management
o Enhanced reporting capabilities
o Support for multi-tenancy
o Disruption free software upgrades
o 100% client compatible
34. Gear6 Products
Web Cache – Universal Distribution:
• Software
• Hardware
Web Cache Server – Cloud
35. Questions?
Thank you for attending
“The Evolution of a Memcached Deployment”
by Gear6
Bill Takacs
Gear6
salesinfo@gear6.com
+1 650 587 7118
36. References
• Danga.com
• Highscalability.com
• Dev.gear6.com
• Groups.google.com/group/memcached
• Code.google.com/p/memcached
• Twitter.com/gearsix
• Cacti.net
• Wireshark.org
• http://dev.mysql.com/doc/mysql-ha-scalability/en/ha-memcached-using-deployment.html
• http://dev.mysql.com/doc/mysql-ha-scalability/en/ha-memcached-using-hashtypes.html
• http://jayant7k.blogspot.com/2009/04/memcached-replication.html
• http://www.lexemetech.com/2007/11/consistent-hashing.html
• http://www8.org/w8-papers/2a-webserver/caching/paper2.html
• http://www.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients
• http://bazaar.launchpad.net/~libmemcached-developers/libmemcached/trunk/revision/539