memcache@facebook


          Marc Kwiatkowski
          memcache tech lead
          QCon




How big is facebook?




400 million active users
400 million active users

[Growth chart: active users over time, in millions, reaching 400M]
Objects
            ▪   More than          million status updates posted each day
                ▪         /s
            ▪   More than billion photos uploaded to the site each month
                ▪        /s
            ▪   More than billion pieces of content (web links, news stories,
                blog posts, notes, photo albums, etc.) shared each week
                ▪    K/s
            ▪   Average user has 130 friends on the site
                ▪        Billion friend graph edges
            ▪   Average user clicks the Like button on pieces of content each
                month

Infrastructure
         ▪   Thousands of servers in several data centers in two regions
             ▪   Web servers
             ▪   DB servers
             ▪   Memcache Servers
             ▪   Other services




The scale of memcache @ facebook
         ▪   Memcache Ops/s
             ▪   over      M gets/sec
             ▪   over    M sets/sec
             ▪   over T cached items
             ▪   over      Tbytes
         ▪   Network IO
             ▪   peak rx      Mpkts/s   GB/s
             ▪   peak tx     Mpkts/s    GB/s




A typical memcache server’s P.O.V.
         ▪   Network I/O
             ▪   rx      Kpkts/s . MB/s
             ▪   tx      Kpkts/s      MB/s
         ▪   Memcache OPS
             ▪        K gets/s
             ▪    K sets/s
         ▪          M items




All rates are 1 day moving averages
Evolution of facebook’s architecture




• When Mark Zuckerberg and his roommates started Facebook in a Harvard dorm in 2004, they put everyone
  on one server
• Then as Facebook grew, they could scale like a traditional site by just adding servers
• Even as the site grew beyond Harvard to Stanford, Columbia, and thousands of other campuses, each was a
  separate network that could be served on an isolated set of servers
• But as people connected more between schools, the model changed--and the big change came
  when Facebook opened to everyone in Sept. 2006
• [For globe]: That led to people being connected everywhere around the world--not just on a single college
  campus.
• [For globe]: This visualization shows accepted friend requests animating from requesting friend to
  accepting friend
Scaling Facebook: Interconnected data


                              Bob






•On Facebook, the data required to serve your home page or
 any other page is incredibly interconnected
•Your data can’t sit on one server or cluster of servers because
 almost every piece of content on Facebook requires
 information about your network of friends
•And the average user has 130 friends
•As we scale, we have to be able to quickly pull data across all
 of our servers, wherever it’s stored.
Scaling Facebook: Interconnected data


                           Bob              Brian




Scaling Facebook: Interconnected data


                    Felicia   Bob           Brian




Memcache Rules of the Game
         ▪   GET object from memcache
             ▪   on miss, query database and SET object to memcache
         ▪   Update database row and DELETE object in memcache
         ▪   No derived objects in memcache
             ▪   Every memcache object maps to persisted data in database




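A minimal sketch of these rules in PHP, assuming the pecl Memcache client and hypothetical load_user_from_db / update_user_in_db helpers; the key naming and user object are illustrative, not the actual Facebook code.

```php
<?php
// Read path: GET from memcache; on a miss, query the database and SET.
function get_user($mc, $db, $user_id) {
    $key = "user:$user_id";
    $user = $mc->get($key);                        // Memcache::get returns false on a miss
    if ($user === false) {
        $user = load_user_from_db($db, $user_id);  // hypothetical DB helper
        $mc->set($key, $user);                     // cache only data persisted in the DB
    }
    return $user;
}

// Write path: update the database row, then DELETE the cached object.
function update_user($mc, $db, $user_id, $fields) {
    update_user_in_db($db, $user_id, $fields);     // hypothetical DB helper
    $mc->delete("user:$user_id");                  // the next read repopulates from the DB
}
```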
Scaling memcache




Phatty Phatty Multiget




Phatty Phatty Multiget (notes)
         ▪   PHP runtime is single threaded and synchronous
         ▪   To get good performance for data-parallel operations like
             retrieving info for all friends, it’s necessary to dispatch memcache
             get requests in parallel
         ▪   Initially we just used polling I/O in PHP.
         ▪   Later we switched to true asynchronous I/O in a PHP C extension
         ▪   In both cases the result was reduced latency through parallelism.




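A sketch of the same idea using the pecl Memcache client, whose get() accepts an array of keys and issues a single multiget; the profile key naming and the DB fallback are illustrative.

```php
<?php
// Fetch cached profiles for all of a user's friends in one round trip
// instead of one synchronous get() per friend.
function get_friend_profiles($mc, array $friend_ids) {
    $keys = array();
    foreach ($friend_ids as $id) {
        $keys["profile:$id"] = $id;
    }
    $cached   = $mc->get(array_keys($keys));       // array form = one multiget request
    $profiles = array();
    $missed   = array();
    foreach ($keys as $key => $id) {
        if (isset($cached[$key])) {
            $profiles[$id] = $cached[$key];
        } else {
            $missed[] = $id;                       // misses fall back to the DB (not shown)
        }
    }
    return array($profiles, $missed);
}
```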
Pools and Threads




                         PHP Client




[Diagram: a PHP client talking to several memcache pools (sp: / cs: server groups)]

Different objects have different sizes and access patterns. We began creating
memcache pools to segregate different kinds of objects for better cache efficiency and
memory utilization.
Pools and Threads (notes)
         ▪   Privacy objects are small but have poor hit rates
         ▪   User-profiles are large but have good hit rates
         ▪   We achieve better overall caching by segregating different classes
             of objects into different pools of memcache servers
         ▪   Memcache was originally a classic single-threaded unix daemon
             ▪   This meant we needed to run multiple instances, each with a
                 fraction of the RAM, on each memcache server
             ▪    X the number of connections to each box
             ▪    X the meta-data overhead
             ▪   We needed a multi-threaded service


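One hedged way to picture pool segregation: choose the pool by object class, then hash the key onto a server within that pool. The pool map and crc32 modulo hashing below are illustrative, not the real client's routing.

```php
<?php
// Illustrative pool map: small, low-hit-rate objects and large, high-hit-rate
// objects live on separate sets of memcache servers.
$pools = array(
    'privacy' => array('10.0.1.1:11211', '10.0.1.2:11211'),
    'profile' => array('10.0.2.1:11211', '10.0.2.2:11211', '10.0.2.3:11211'),
);

// Pick a server: select the pool for the object class, then hash into it.
function pick_server(array $pools, $class, $key) {
    $servers = $pools[$class];
    $index = abs(crc32($key)) % count($servers);   // simple modulo hashing for illustration
    return $servers[$index];
}

// e.g. pick_server($pools, 'profile', 'profile:1234') => one host in the profile pool
```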
Connections and Congestion
       ▪   [animation]




Connections and Congestion (notes)
         ▪   As we added web-servers the connections to each memcache box
             grew.
             ▪   Each webserver ran   -   PHP processes
             ▪   Each memcache box has       K+ TCP connections
             ▪   UDP could reduce the number of connections
         ▪   As we added users and features, the number of keys per-multiget
             increased
             ▪   Popular people and groups
             ▪   Platform and FBML
         ▪   We began to see incast congestion on our ToR switches.


Serialization and Compression
         ▪   We noticed our short profiles weren’t so short
             ▪       K PHP serialized object
             ▪   fb-serialization
                 ▪   based on thrift wire format
                 ▪    X faster
                 ▪       smaller
             ▪   gzcompress serialized strings




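A sketch of the compression half only, using PHP's built-in serialize() and gzcompress(); fb-serialization itself is a C extension based on the Thrift wire format and isn't reproduced here. The size threshold and key prefix are illustrative guesses.

```php
<?php
// Store a profile: serialize it, gzip-compress larger blobs, then SET.
function cache_profile($mc, $user_id, array $profile) {
    $blob = serialize($profile);              // fb_serialize would replace this step
    if (strlen($blob) > 1024) {               // illustrative threshold
        $blob = "gz:" . gzcompress($blob);
    }
    $mc->set("profile:$user_id", $blob);
}

// Fetch a profile: undo the compression and serialization on the way out.
function fetch_profile($mc, $user_id) {
    $blob = $mc->get("profile:$user_id");
    if ($blob === false) {
        return false;                         // miss: caller falls back to the DB
    }
    if (substr($blob, 0, 3) === "gz:") {
        $blob = gzuncompress(substr($blob, 3));
    }
    return unserialize($blob);
}
```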
Multiple Datacenters

                          SC Web




                            SC
                         Memcache




                                   SC MySQL




Multiple Datacenters

                          SC Web              SF Web




                            SC               SF
                         Memcache         Memcache




                                   SC MySQL




Multiple Datacenters

                           SC Web              SF Web

                         Memcache Proxy    Memcache Proxy




                            SC                SF
                         Memcache          Memcache




                                    SC MySQL




Multiple Datacenters (notes)

         ▪   In the early days we had two data-centers
             ▪   The one we were about to turn off
             ▪   The one we were about to turn on
         ▪   Eventually we outgrew a single data-center
             ▪   Still only one master database tier
             ▪   Rules of the game require that after an update we need to
                 broadcast deletes to all tiers
             ▪   The mcproxy era begins




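The broadcast itself might look something like this sketch: after a master-DB update, send the delete to each datacenter's memcache tier through its proxy. The proxy endpoint list is hypothetical; mcproxy's real interface isn't described in the talk.

```php
<?php
// Hypothetical memcache proxy endpoints, one per datacenter.
$mcproxy_endpoints = array('sc-mcproxy:11211', 'sf-mcproxy:11211');

// After updating the master database, broadcast the delete to every tier.
function broadcast_delete(array $endpoints, $key) {
    foreach ($endpoints as $endpoint) {
        list($host, $port) = explode(':', $endpoint);
        $mc = new Memcache();
        if (@$mc->connect($host, (int) $port)) {  // the proxy forwards to the right server
            $mc->delete($key);
            $mc->close();
        }
    }
}

// update_user_in_db($db, $user_id, $fields);             // hypothetical DB write
// broadcast_delete($mcproxy_endpoints, "user:$user_id");
```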
Multiple Regions
                              West Coast           East Coast
                          SC Web                       VA Web




                            SC                          VA
                         Memcache                    Memcache




                                              Memcache Proxy


                                   SC MySQL        VA MySQL




Multiple Regions
                                 West Coast                      East Coast
                           SC Web              SF Web                VA Web

                         Memcache Proxy    Memcache Proxy




                            SC                SF                      VA
                         Memcache          Memcache                Memcache




                                                            Memcache Proxy


                                    SC MySQL                     VA MySQL




Multiple Regions
                                 West Coast                                      East Coast
                           SC Web              SF Web                                VA Web

                         Memcache Proxy    Memcache Proxy




                            SC                SF                                      VA
                         Memcache          Memcache                                Memcache




                                                                            Memcache Proxy


                                    SC MySQL            MySQL replication        VA MySQL




Multiple Regions (notes)

         ▪   Latency to east coast and European users was/is terrible.
         ▪   So we deployed a slave DB tier in Ashburn VA
             ▪   The slave DB syncs with the master via the MySQL binlog
         ▪   This introduces a race condition
         ▪   mcproxy to the rescue again
             ▪   Add a memcache delete pragma to MySQL update and insert ops
             ▪   Added a thread to the slave mysqld to dispatch deletes on the east coast
                 via mcproxy




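One way to picture the pragma, purely as a sketch: the web tier appends a comment naming the memcache keys to invalidate to each UPDATE/INSERT, and the thread tailing replication on the slave parses that comment and hands the keys to mcproxy. The comment syntax and function names are illustrative, not the actual mysqld patch.

```php
<?php
// Web tier: tag a SQL statement with the memcache keys it invalidates.
function with_mc_delete_pragma($sql, array $keys) {
    // e.g. "UPDATE profile SET ... /* mcdelete: profile:1234,user:1234 */"
    return $sql . " /* mcdelete: " . implode(",", $keys) . " */";
}

// Slave side (conceptually the added mysqld thread): pull the keys back out of
// a replicated statement so they can be deleted via the local mcproxy.
function extract_mc_deletes($replicated_sql) {
    if (preg_match('|/\* mcdelete: ([^*]+) \*/|', $replicated_sql, $m)) {
        return explode(",", trim($m[1]));
    }
    return array();
}
```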
Replicated Keys

                  Memcache     Memcache     Memcache




                  PHP Client   PHP Client   PHP Client

Replicated Keys

                  Memcache     Memcache     Memcache

                                 key




                  PHP Client   PHP Client   PHP Client

Replicated Keys

                  Memcache      Memcache     Memcache

                               key key key




                  PHP Client    PHP Client   PHP Client

Replicated Keys

                  Memcache     Memcache     Memcache




                                 key
                  PHP Client   PHP Client   PHP Client

Replicated Keys

                  Memcache      Memcache     Memcache

                         key#    key#         key#




                                  key
                  PHP Client    PHP Client   PHP Client

Replicated Keys (notes)
         ▪   Viral groups and applications cause hot keys
         ▪   More gets than a single memcache server can process
             ▪   (Remember the rules of the game!)
             ▪   That means more queries than a single DB server can process
             ▪   That means that group or application is effectively down
         ▪   Creating key aliases allows us to add server capacity.
             ▪   Hot keys are published to all web-servers
             ▪   Each web-server picks an alias for gets
                 ▪   get key:xxx => get key:xxx#N
             ▪   Each web-server deletes all aliases


Memcache Rules of the Game
         ▪   New Rule
             ▪   If a key is hot, pick an alias and fetch that for reads
             ▪   Delete all aliases on updates




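A sketch of the aliasing rule, assuming the hot-key list is pushed to every web server and each hot key gets a fixed number of replicas; the names and replica count are illustrative.

```php
<?php
// Reads: if the key is hot, fetch one randomly chosen alias instead.
function get_maybe_replicated($mc, array $hot_keys, $key, $replicas = 10) {
    if (isset($hot_keys[$key])) {
        $alias = $key . "#" . mt_rand(1, $replicas);   // get key:xxx => get key:xxx#N
        return $mc->get($alias);
    }
    return $mc->get($key);
}

// Writes: after the database update, delete every alias so no replica stays stale.
function delete_maybe_replicated($mc, array $hot_keys, $key, $replicas = 10) {
    if (!isset($hot_keys[$key])) {
        $mc->delete($key);
        return;
    }
    for ($i = 1; $i <= $replicas; $i++) {
        $mc->delete($key . "#" . $i);
    }
}
```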
Mirrored Pools


                Specialized Replica                   Specialized Replica
                         Shard       Shard                 Shard        Shard




                                   General pool with wide fanout
                           Shard        Shard      Shard            Shard n

                                                              ...


Mirrored Pools (notes)
            ▪   As our memcache tier grows the ratio of keys/packet decreases
                ▪        keys/ server = packet
                ▪        keys/   server =   packets
                ▪   More network traffic
                ▪   More memcache server kernel interrupts per request
            ▪   Confirmed Info - critical account meta-data
                ▪   Have you confirmed your account?
                ▪   Are you a minor?
                ▪   Pulled from large user-profile objects
            ▪   Since we just need a few bytes of data for many users, we mirror that
                data into a small specialized replica pool

Hot Misses
         ▪   [animation]




Hot Misses (notes)
         ▪   Remember the rules of the game
             ▪   update and delete
             ▪   miss, query, and set
         ▪   When the object is very, very popular, that query rate can kill a
             database server
         ▪   We need flow control!




Memcache Rules of the Game
            ▪   For hot keys, on miss grab a mutex before issuing db query
                ▪   memcache-add a per-object mutex
                    ▪    key:xxx => key:xxx#mutex
                    ▪    If add succeeds do the query
                    ▪    If add fails (because mutex already exists) back-off and try again
                    ▪    After set delete mutex




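A sketch of this rule using Memcache::add, which only succeeds if the key does not already exist; the expiry, back-off interval, and DB helper are illustrative.

```php
<?php
// On a miss for a hot key, only the process that wins the add() hits the database.
function get_with_mutex($mc, $db, $key, $max_tries = 10) {
    for ($try = 0; $try < $max_tries; $try++) {
        $value = $mc->get($key);
        if ($value !== false) {
            return $value;                          // hit
        }
        // add() fails if the mutex already exists, so only one process proceeds.
        if ($mc->add("$key#mutex", 1, 0, 3)) {      // 3s expiry as a safety net
            $value = query_database($db, $key);     // hypothetical DB helper
            $mc->set($key, $value);
            $mc->delete("$key#mutex");              // after set, delete the mutex
            return $value;
        }
        usleep(50000);                              // back off 50ms and try again
    }
    return query_database($db, $key);               // give up waiting on the mutex
}
```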
Hot Deletes
            ▪   [hot groups graphics]




Hot Deletes (notes)
         ▪   We’re not out of the woods yet
         ▪   Cache mutex doesn’t work for frequently updated objects
             ▪   like membership lists and walls for viral groups and applications.
         ▪   Each process that acquires a mutex finds that the object has been
             deleted again
             ▪   ...and again
             ▪   ...and again




Rules of the Game: Caching Intent
         ▪   Each memcache server is in the perfect position to detect and
             mitigate contention
             ▪   Record misses
             ▪   Record deletes
             ▪   Serve stale data
             ▪   Serve lease-ids
             ▪   Don’t allow updates without a valid lease id




Next Steps




Shaping Memcache Traffic
         ▪   mcproxy as router
             ▪   admission control
             ▪   tunneling inter-datacenter traffic




Cache Hierarchies
         ▪   Warming up Cold Clusters
         ▪   Proxies for Cacheless Clusters




Big Low Latency Clusters
         ▪   Bigger Clusters are Better
         ▪   Low Latency is Better
         ▪   L .
         ▪   UDP
         ▪   Proxy Facebook Architecture




Worse IS better
         ▪   Richard Gabriel’s famous essay contrasted
             ▪   ITS and Unix
             ▪   LISP and C
             ▪   MIT and New Jersey




http://www.jwz.org/doc/worse-is-better.html
Why Memcache Works
         ▪   Uniform, low latency with partial results is a better user
             experience
         ▪   memcache provides a few robust primitives
             ▪   key-to-server mapping
             ▪   parallel I/O
             ▪   flow-control
             ▪   traffic shaping
         ▪   that allow ad hoc solutions to a wide range of scaling issues




We started with simple, obvious improvements.
As we grew we deployed less obvious improvements...
But they’ve remained pretty simple
(c) Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved.




