SlideShare uma empresa Scribd logo
1 de 35
Baixar para ler offline
Database Sharding
#BarCampGhent2
Tags
scaling, performance,
database, php, mySQL,
memcached, sphinx
echo “Hello, world!”;




 Jayme Rotsaert

    core developer @ Netlog

    since 2 years

 Jurriaan Persyn

    lead web developer @ Netlog

    since 3 years
What is sharding?




        A technique to scale databases
Some requirements @ Netlog ...




•   serve 36+ million unique users

•   4+ billion pageviews a month

•   huge amounts of data (eg. 100+ million
    friendships on nl.netlog.com)

•   write-heavy app (1.4/1 read-write ratio)

•   typical db up to 3000+ queries/sec (15h-22h)
Master
 r/w
Master
          w




Slave            Slave
  r                r
Top
               Master
                 w

Messages                    Friends
  r/w                         r/w
            Top      Top
           Slave    Slave
             r        r
Top
                     Master
                       w

   Messages                           Friends
      w                                  w
                  Top      Top
                 Slave    Slave
                   r        r
Msgs     Msgs                     Frnds    Frnds
Slave    Slave                    Slave    Slave
  r        r                        r        r
1040:
               Too many      Top
              connections   Master
                              w

   Messages                                  Friends
      w                                         w
                     Top          Top
                    Slave        Slave
                      r            r
Msgs     Msgs                            Frnds    Frnds
Slave    Slave                           Slave    Slave
  r        r                               r        r
1040:
                  Too many      Top
                 connections   Master      1040:
                                 w       Too many
    1040:                               connections
  Too many
    Messages
 connections                                          Friends
        w                                                w
                        Top          Top
                       Slave        Slave
                         r            r
Msgs        Msgs1040:                        Frnds         Frnds
              Too many
Slave       Slave
             connections                     Slave         Slave
  r           r                                r             r
?
Vertical
partitioning?
Master-to-master
  replication?
Caching?
Sharding!
Friends
                     %10 = 2
          Friends                 Friends
          %10 = 1                 %10 = 3
Friends                                     Friends
%10 = 0                                     %10 = 4
                    Aggregation
Friends                                     Friends
%10 = 9                                     %10 = 5
          Friends                 Friends
          %10 = 8                 %10 = 6
                     Friends
                     %10 = 7
More data?
More shards!
Existing solutions?




 •   MySQL NDB storage engine
     (sharded, not dynamic)

 •   memcached from Mysql
     (SQL-functions or storage engine)

 •   Oracle RAC

 •   HiveDB
     (mySQL sharding framework in Java)
Our solution




 • in-house
 • in php
 • middleware between application logic and
     class DB


 • typically carve shards by   $userID
sharddbhost001

    sharddb001            sharddb002

shard0001 shard0002   shard0005 shard0006



shard0003 shard0004   shard0007 shard0008
Overview of our Sharding implementation




 • Sharding Management
  • “DNS” System (the modulo function)
  • Balancer / Manager
 • Sharded Tables API
  • Database Access Layer
  • Caching Layer
Sharding Management “DNS”




 • “DNS” system translates      $userID   to the right db
    connection details

   •   $userID to $shardID DNS
       (via SQL/memcache - combination not
       fixed!)

   •   $shardIDto $hostname & $databasename
       (generated configuration files)
Sharded Tables API




 • Example API:
  • An object per         /      -combination
                     $tableName $userID


    • implementation of a class providing basic
         CRUD functions

      • typically a class for accessing database
         records with “a user’s items”
Some implications ...




  • No cross-shard (i.e. cross-user) SQL queries
   •            between sharded tables
        (LEFT) JOIN

       becomes impossibly complicated

    • It’s possible to design (parts of) application
       so there’s no need for cross-shard queries

  • Denormalize if you need
     $userID
                              SELECT   on other than


  • Data consistency
Some implications ... (2)




  •   Your DBA loves you again

      •   Smaller, thus faster tables

      •   Simpler, thus faster queries

  •   More atomic operations > better caching

  •   More PHP processing

      •   Needs memory

      •   PHP-webservers scale more easily
Some implications ... (3)




  •   $itemID   will only be unique in relation to $userID

  •   Downtime of a single databasehost affects
      only users on that DB
Sharding Management: Balancing Shards




 • Define ‘load’ percentage for shards (#users),
    databases (#users, #filesize), hosts (#sql
    reads, #sql writes, #cpu load, #users, ...)

 • Balance loads and start move operations
  • Done completely in PHP / transparant / no
       user downtime
What is memcached?




General-purpose distributed memory caching
Using memcached


function isSober($user)
{
	 $memcache = new Memcache();
	 $cacheKey = 'issober_' . $user->getUserID();
	 $result = $memcache->get($cacheKey); // fetch
	 if ($result === false)
	 {
	 	 // do some database heavy stuff
	 	 $result = (($user->getJobIndustry() == Industry::DEFENSE) &&
$location->isIn(City::get('NYC'))) ? quot;hammeredquot; : quot;soberquot;; //
whatever!
	 	
	 	 $memcache->set($cacheKey, $result, 0); // unlimited ttl
	 }
	 return $result;
}

var_dump(isSober(new User(quot;p.decremquot;))); // --> string(8) quot;hammeredquot;
memcached usage for Sharding




 • Typical usage:
  • Each sharded record is cached
         (key: table/userID/itemID)

     • Caches with lists, and caches with counts
         (key: where/order/...-clauses)

 •    Several caching modes:

     •   READ_INSERT_MODE


     •   READ_UPDATE_INSERT_MODE
CacheRevisionNumbers



 • What? Cached version number to use in other
     cache-keys

 • Why? Caching of counts / lists
 • Example: cache key for list of users latest
     photos (simplified):   ”USER_PHOTOS” . $userID .
     $cacheRevisionNumber . ”ORDERBYDATEADDDESCLIMIT10”;


 •   $cacheRevisionNumberis number, bumped on
     every CUD-action, clears caches of all counts
     +lists, else unlimited ttl.

 • “number” is current/cached timestamp
Sphinx Search



 • Problem:
    How do you give an overview of eg. latest
    photos from different users? (on different
    shards)

 • Solution:
    Check Jayme’s presentation “Sphinx search
    optimization”, distributed full text search.
    (Use it for more than searching!)
netlog.com/go/developer
jayme@netlog.com - jurriaan@netlog.com

Mais conteúdo relacionado

Semelhante a Database Sharding at Netlog

Practical Examples for Efficient I/O on Cray XT Systems (CUG 2009)
Practical Examples for Efficient I/O on Cray XT Systems (CUG 2009)Practical Examples for Efficient I/O on Cray XT Systems (CUG 2009)
Practical Examples for Efficient I/O on Cray XT Systems (CUG 2009)
Jeff Larkin
 

Semelhante a Database Sharding at Netlog (20)

Ruby on Rails in UbiSunrise
Ruby on Rails in UbiSunriseRuby on Rails in UbiSunrise
Ruby on Rails in UbiSunrise
 
WebCamp: Developer Day: The Big, the Small and the Redis - Андрей Савченко
WebCamp: Developer Day: The Big, the Small and the Redis - Андрей СавченкоWebCamp: Developer Day: The Big, the Small and the Redis - Андрей Савченко
WebCamp: Developer Day: The Big, the Small and the Redis - Андрей Савченко
 
Wichert Akkerman - Plone.Org Infrastructure
Wichert Akkerman - Plone.Org InfrastructureWichert Akkerman - Plone.Org Infrastructure
Wichert Akkerman - Plone.Org Infrastructure
 
Wichert Akkerman Plone Deployment Practices The Plone.Org Setup
Wichert Akkerman   Plone Deployment Practices   The Plone.Org SetupWichert Akkerman   Plone Deployment Practices   The Plone.Org Setup
Wichert Akkerman Plone Deployment Practices The Plone.Org Setup
 
Rails and Legacy Databases - RailsConf 2009
Rails and Legacy Databases - RailsConf 2009Rails and Legacy Databases - RailsConf 2009
Rails and Legacy Databases - RailsConf 2009
 
Compass, Sass, and the Enlightened CSS Developer
Compass, Sass, and the Enlightened CSS DeveloperCompass, Sass, and the Enlightened CSS Developer
Compass, Sass, and the Enlightened CSS Developer
 
Clojure at BackType
Clojure at BackTypeClojure at BackType
Clojure at BackType
 
When To Use Ruby On Rails
When To Use Ruby On RailsWhen To Use Ruby On Rails
When To Use Ruby On Rails
 
API's, Freebase, and the Collaborative Semantic web
API's, Freebase, and the Collaborative Semantic webAPI's, Freebase, and the Collaborative Semantic web
API's, Freebase, and the Collaborative Semantic web
 
A Re-Introduction to JavaScript
A Re-Introduction to JavaScriptA Re-Introduction to JavaScript
A Re-Introduction to JavaScript
 
Dynomite at Erlang Factory
Dynomite at Erlang FactoryDynomite at Erlang Factory
Dynomite at Erlang Factory
 
Perl University: Getting Started with Perl
Perl University: Getting Started with PerlPerl University: Getting Started with Perl
Perl University: Getting Started with Perl
 
Vidoop CouchDB Talk
Vidoop CouchDB TalkVidoop CouchDB Talk
Vidoop CouchDB Talk
 
Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret Sauce
 
Practical Examples for Efficient I/O on Cray XT Systems (CUG 2009)
Practical Examples for Efficient I/O on Cray XT Systems (CUG 2009)Practical Examples for Efficient I/O on Cray XT Systems (CUG 2009)
Practical Examples for Efficient I/O on Cray XT Systems (CUG 2009)
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
 
Os Bowkett
Os BowkettOs Bowkett
Os Bowkett
 
Spark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-CasesSpark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-Cases
 
Using Spark's RDD APIs for complex, custom applications
Using Spark's RDD APIs for complex, custom applicationsUsing Spark's RDD APIs for complex, custom applications
Using Spark's RDD APIs for complex, custom applications
 
Juggling Chainsaws: Perl and MongoDB
Juggling Chainsaws: Perl and MongoDBJuggling Chainsaws: Perl and MongoDB
Juggling Chainsaws: Perl and MongoDB
 

Mais de Jurriaan Persyn

Mais de Jurriaan Persyn (6)

An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
 
Engagor Walkthrough
Engagor WalkthroughEngagor Walkthrough
Engagor Walkthrough
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Developing Social Games in the Cloud
Developing Social Games in the CloudDeveloping Social Games in the Cloud
Developing Social Games in the Cloud
 
Meet the OpenSocial Containers: Netlog
Meet the OpenSocial Containers: NetlogMeet the OpenSocial Containers: Netlog
Meet the OpenSocial Containers: Netlog
 
Get Your Frontend Sorted (Barcamp Gent 2008)
Get Your Frontend Sorted (Barcamp Gent 2008)Get Your Frontend Sorted (Barcamp Gent 2008)
Get Your Frontend Sorted (Barcamp Gent 2008)
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Último (20)

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Database Sharding at Netlog

  • 1.
  • 3. Tags scaling, performance, database, php, mySQL, memcached, sphinx
  • 4. echo “Hello, world!”; Jayme Rotsaert core developer @ Netlog since 2 years Jurriaan Persyn lead web developer @ Netlog since 3 years
  • 5. What is sharding? A technique to scale databases
  • 6. Some requirements @ Netlog ... • serve 36+ million unique users • 4+ billion pageviews a month • huge amounts of data (eg. 100+ million friendships on nl.netlog.com) • write-heavy app (1.4/1 read-write ratio) • typical db up to 3000+ queries/sec (15h-22h)
  • 8. Master w Slave Slave r r
  • 9. Top Master w Messages Friends r/w r/w Top Top Slave Slave r r
  • 10. Top Master w Messages Friends w w Top Top Slave Slave r r Msgs Msgs Frnds Frnds Slave Slave Slave Slave r r r r
  • 11. 1040: Too many Top connections Master w Messages Friends w w Top Top Slave Slave r r Msgs Msgs Frnds Frnds Slave Slave Slave Slave r r r r
  • 12. 1040: Too many Top connections Master 1040: w Too many 1040: connections Too many Messages connections Friends w w Top Top Slave Slave r r Msgs Msgs1040: Frnds Frnds Too many Slave Slave connections Slave Slave r r r r
  • 13. ?
  • 18. Friends %10 = 2 Friends Friends %10 = 1 %10 = 3 Friends Friends %10 = 0 %10 = 4 Aggregation Friends Friends %10 = 9 %10 = 5 Friends Friends %10 = 8 %10 = 6 Friends %10 = 7
  • 20. Existing solutions? • MySQL NDB storage engine (sharded, not dynamic) • memcached from Mysql (SQL-functions or storage engine) • Oracle RAC • HiveDB (mySQL sharding framework in Java)
  • 21. Our solution • in-house • in php • middleware between application logic and class DB • typically carve shards by $userID
  • 22. sharddbhost001 sharddb001 sharddb002 shard0001 shard0002 shard0005 shard0006 shard0003 shard0004 shard0007 shard0008
  • 23. Overview of our Sharding implementation • Sharding Management • “DNS” System (the modulo function) • Balancer / Manager • Sharded Tables API • Database Access Layer • Caching Layer
  • 24. Sharding Management “DNS” • “DNS” system translates $userID to the right db connection details • $userID to $shardID DNS (via SQL/memcache - combination not fixed!) • $shardIDto $hostname & $databasename (generated configuration files)
  • 25. Sharded Tables API • Example API: • An object per / -combination $tableName $userID • implementation of a class providing basic CRUD functions • typically a class for accessing database records with “a user’s items”
  • 26. Some implications ... • No cross-shard (i.e. cross-user) SQL queries • between sharded tables (LEFT) JOIN becomes impossibly complicated • It’s possible to design (parts of) application so there’s no need for cross-shard queries • Denormalize if you need $userID SELECT on other than • Data consistency
  • 27. Some implications ... (2) • Your DBA loves you again • Smaller, thus faster tables • Simpler, thus faster queries • More atomic operations > better caching • More PHP processing • Needs memory • PHP-webservers scale more easily
  • 28. Some implications ... (3) • $itemID will only be unique in relation to $userID • Downtime of a single databasehost affects only users on that DB
  • 29. Sharding Management: Balancing Shards • Define ‘load’ percentage for shards (#users), databases (#users, #filesize), hosts (#sql reads, #sql writes, #cpu load, #users, ...) • Balance loads and start move operations • Done completely in PHP / transparant / no user downtime
  • 30. What is memcached? General-purpose distributed memory caching
  • 31. Using memcached function isSober($user) { $memcache = new Memcache(); $cacheKey = 'issober_' . $user->getUserID(); $result = $memcache->get($cacheKey); // fetch if ($result === false) { // do some database heavy stuff $result = (($user->getJobIndustry() == Industry::DEFENSE) && $location->isIn(City::get('NYC'))) ? quot;hammeredquot; : quot;soberquot;; // whatever! $memcache->set($cacheKey, $result, 0); // unlimited ttl } return $result; } var_dump(isSober(new User(quot;p.decremquot;))); // --> string(8) quot;hammeredquot;
  • 32. memcached usage for Sharding • Typical usage: • Each sharded record is cached (key: table/userID/itemID) • Caches with lists, and caches with counts (key: where/order/...-clauses) • Several caching modes: • READ_INSERT_MODE • READ_UPDATE_INSERT_MODE
  • 33. CacheRevisionNumbers • What? Cached version number to use in other cache-keys • Why? Caching of counts / lists • Example: cache key for list of users latest photos (simplified): ”USER_PHOTOS” . $userID . $cacheRevisionNumber . ”ORDERBYDATEADDDESCLIMIT10”; • $cacheRevisionNumberis number, bumped on every CUD-action, clears caches of all counts +lists, else unlimited ttl. • “number” is current/cached timestamp
  • 34. Sphinx Search • Problem: How do you give an overview of eg. latest photos from different users? (on different shards) • Solution: Check Jayme’s presentation “Sphinx search optimization”, distributed full text search. (Use it for more than searching!)