The Secret Sauce of Sharding


Ryan Thiessen
Database Operations
April 2011
Agenda
 1   Sharding 101

 2   Bad Sharding

 3   Facebook’s Universal Database

 4   Re-Sharding

 5   Operational Implications
Sharding 101
Bad news: there is no single way to shard
▪  What is the secret sauce of anything?
▪  Some basic building blocks
▪  More about what NOT to do rather than a specific recipe
▪  Wide variation in implementation
Why not to shard your data
▪  Can’t do JOINs inside the RDBMS across shards
▪  Data denormalization has drawbacks
 ▪    Redundant storage
 ▪    Chore to keep everything in sync
▪  Ops & maintenance are harder
 ▪    Schema changes are more difficult
 ▪    Monitoring challenges

▪  You don’t do this because it’s cool, but because you have to
Why to shard your data
▪  Because you have to
▪  Doing joins outside of the RDBMS isn’t that bad
▪  Less contention on hot tables
▪  Continue using commodity hardware
▪  Single instance failure affects only a small proportion of users
Basic building blocks of good sharding
▪  Shard uniformity
 ▪    SKU, schema, queries
▪  Organize shards according to data access patterns
 ▪    Picking the right key to shard on
▪  Ability to grow, re-shard and shed load quickly
▪  Achieve operational efficiencies of scale
Bad Sharding
“Sharding” by application
Bad sharding
▪  Example: each application gets its own database
▪  Result:
 ▪    Data distribution is non-uniform, massive hot spots
 ▪    Every data access pattern is unique
 ▪    Very little efficiency of scale

  Commerce Database   User Database   Logging Database   Customer Database   Sales Database   Config Database
Fixed hashing
Bad Sharding
▪  Example: you have X instances
 ▪    Hashing algorithm splits data evenly across the instances
▪  Result:
 ▪    Unbalanced load, hot spots
 ▪    What to do about data growth?
 ▪    How do you re-shard and/or shed load?
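
To illustrate why growth is awkward under fixed hashing, here is a minimal sketch that assumes placement is a plain `key % X` (the talk does not name the hash). The function name and the key range are hypothetical. It counts how many keys land on a different instance when one instance is added.

```python
def fraction_moved(keys, old_count, new_count):
    """Fraction of keys whose instance changes when the instance count changes.

    Assumes placement is simply key % instance_count (an illustrative choice).
    """
    keys = list(keys)
    moved = sum(1 for k in keys if k % old_count != k % new_count)
    return moved / len(keys)


if __name__ == "__main__":
    # Growing from 4 to 5 instances with plain modulo placement.
    print(fraction_moved(range(10_000), 4, 5))
```

Under this assumption, going from 4 to 5 instances relocates 80% of the keys, which is why adding capacity means moving most of the data.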
Hyper-sharding
Bad Sharding
▪  Example: hash keys randomly across all instances, without any grouping
▪  Result:
 ▪    Every fetch has to touch many shards to fulfill a request
 ▪    Request latency becomes the max() of all shard latencies
 ▪    A single shard’s availability issue affects every request
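
A rough model of the two results above, assuming shards fail independently and each has the same availability (both are simplifying assumptions, and the function names are hypothetical):

```python
def request_latency(shard_latencies_ms):
    """A fan-out request finishes only when its slowest shard has answered."""
    return max(shard_latencies_ms)


def request_availability(shard_availability, shards_touched):
    """Probability that every shard touched by a request is up.

    Assumes independent failures and identical per-shard availability.
    """
    return shard_availability ** shards_touched


if __name__ == "__main__":
    print(request_latency([3, 5, 120]))      # one slow shard dominates
    print(request_availability(0.999, 1))    # request touching a single shard
    print(request_availability(0.999, 100))  # request fanned out to 100 shards
```

Under these assumptions, a request that touches 100 shards at 99.9% availability each succeeds only about 90% of the time.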
How to choose a good shard key?
▪  Understand how your applications will access your data
 ▪    Be careful of data distribution
▪  Example: user ID
▪  Example: time grouping
▪  Example: random sharding
▪  TL;DR: use the same methodology as picking a partition key
Facebook’s Universal Database
Multiple shards per physical host
Facebook UDB
▪  Multiple database shards per MySQL instance
▪  Multiple MySQL instances per host on different ports
▪  Each shard has an identical schema

▪  This enables web scale
Hashing
Facebook UDB
▪  Group related objects together
 ▪    Collocate most user data on a single shard
 ▪    If an application has related objects, group them together
▪  When referring to objects in a remote shard, store a reference to the object in both shards
▪  Multiple logical hashing schemes can co-exist over the same set of physical hosts
Shard management service
Facebook UDB
▪  Methods:
 ▪    Map object IDs to logical (shard) IDs – procedural (simple hash)
 ▪    Map shard IDs to physical instances – manual
▪  Use Thrift to access these methods from any language
▪  Distribute shard metadata close to apps to reduce request latency
 ▪    Extremely read heavy
 ▪    Updated relatively infrequently
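
The two mappings can be sketched as below. This is an illustration only, not Facebook's service: `ShardMap`, `shard_for_object` and the instance name `db900:3306` are hypothetical, and the modulo of 40000 is borrowed from the fetch example in this talk.

```python
NUM_SHARDS = 40000  # modulo borrowed from the talk's fetch example


def shard_for_object(object_id):
    """Procedural mapping: object ID -> logical shard ID via a simple hash."""
    return object_id % NUM_SHARDS


class ShardMap:
    """Manual mapping: logical shard ID -> physical instance (plain metadata)."""

    def __init__(self):
        self._instances = {}

    def assign(self, shard_id, instance):
        self._instances[shard_id] = instance

    def instance_for(self, shard_id):
        return self._instances[shard_id]


if __name__ == "__main__":
    shard_map = ShardMap()
    shard = shard_for_object(12345678901)
    shard_map.assign(shard, "db243:3306")
    # Moving a shard only rewrites metadata; object IDs still hash to the same shard.
    shard_map.assign(shard, "db900:3306")  # hypothetical new instance
    print(shard, shard_map.instance_for(shard))
```

Keeping the hash fixed and the shard-to-instance table editable is what lets shards move between hosts without touching object IDs.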
Example: fetching data from a shard
Facebook UDB
▪  Example: application request to get data for object ID 12345678901
 ▪    Call a function: 12345678901 % 40000 => maps to shard 38901
▪  Resolve shard ID 38901 to physical instances
        Instance       Repl Type     Region         Enabled
        db243:3306     master        A              enabled
        db533:3308     replica       A              enabled
        db874:3306     replica       B              disabled
        db983:3307     replica       B              enabled


▪  The application is in region B and only needs to read, so prefer to return a connection to shard 38901 on instance db983:3307
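
The selection step can be sketched as follows, using the rows of the table above. The `Instance` type, `pick_instance` and the fallback order are assumptions made for illustration; the talk only states the preferred result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    address: str
    repl_type: str  # "master" or "replica"
    region: str
    enabled: bool


# Rows copied from the table for shard 38901.
SHARD_38901 = [
    Instance("db243:3306", "master", "A", True),
    Instance("db533:3308", "replica", "A", True),
    Instance("db874:3306", "replica", "B", False),
    Instance("db983:3307", "replica", "B", True),
]


def pick_instance(instances, region, needs_write):
    """Return the master for writes, otherwise prefer an enabled local replica."""
    candidates = [i for i in instances if i.enabled]
    if needs_write:
        return next(i for i in candidates if i.repl_type == "master")
    replicas = [i for i in candidates if i.repl_type == "replica"]
    local = [i for i in replicas if i.region == region]
    # Assumed fallback order: local replica, any replica, any enabled instance.
    return (local or replicas or candidates)[0]


if __name__ == "__main__":
    print(pick_instance(SHARD_38901, region="B", needs_write=False).address)
```

The disabled replica db874:3306 is skipped even though it is in region B, so the read lands on db983:3307.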
Adding nodes
 Facebook UDB
 ▪  New user pools
   ▪    List(s) of shard IDs where new objects go
   ▪    Reverse the hashing function, generate an object ID which maps to one of the new ID pool shards
 ▪  Usually new instances, to add more overall capacity to the tier
 ▪  Can be existing instances, to get more utilization

 App requests storage on new node -> Get list of available shards, pick one -> Generate ID which maps to that shard -> Connect to the selected shard, save object
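
"Reversing the hashing function" can be sketched like this, assuming the hash is the `% 40000` modulo from the fetch example. `allocate_object_id` and the per-shard sequence number are hypothetical, since the talk does not say how IDs are kept unique.

```python
import random

NUM_SHARDS = 40000  # modulo from the talk's fetch example


def allocate_object_id(new_shard_pool, sequence_number, rng=random):
    """Pick a shard from the pool, then build an object ID that hashes to it.

    sequence_number must be unique per shard for the IDs to be unique; how
    that counter is maintained is outside this sketch.
    """
    shard_id = rng.choice(new_shard_pool)
    object_id = sequence_number * NUM_SHARDS + shard_id
    return shard_id, object_id


if __name__ == "__main__":
    pool = [38901, 39000]  # hypothetical shard IDs living on the new instances
    shard_id, object_id = allocate_object_id(pool, sequence_number=308641)
    print(shard_id, object_id, object_id % NUM_SHARDS == shard_id)
```

Because the shard is chosen first and the ID is derived from it, new objects can be steered to whichever shards currently have spare capacity.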
Re-Sharding
The Easy Way: shedding load
Re-Sharding
▪  Split off logical dbs from a single MySQL instance

 Before:
   Host1:3306:  ShardA  ShardB  ShardC  ShardD
   Host2:3306:  ShardA  ShardB  ShardC  ShardD

 Split:
   1.  Block writes
   2.  Break replication from Host1->Host2
   3.  Drop databases
   4.  Reconfigure Shard Manager to point to new instances
   5.  Re-enable writes

 After:
   Host1:3306:  ShardA  ShardC
   Host2:3306:  ShardB  ShardD

▪  Splitting off instances running on different ports is easier
The Hard Way: double-write data
Re-Sharding
1.    Create new layout on all new instances
2.    On each new write, store in both places
3.    Separate process to backfill from the legacy storage
4.    Switch over reads to the new storage
5.    Monitor the old storage for reads
6.    Stop double-writes, drop old tables


▪  This is I/O intensive and painful, but very possible
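
The double-write steps can be modelled with a toy store in which plain dicts stand in for the old and new storage. The class and its flags are hypothetical. Deletes and concurrent writers are ignored, and the sketch assumes the backfill never overwrites a key that a double-write has already stored.

```python
class DoubleWriteStore:
    """Toy model of a double-write migration between two storage layouts."""

    def __init__(self, old):
        self.old = old              # legacy storage, already populated
        self.new = {}               # step 1: new layout starts empty
        self.double_write = True
        self.read_from_new = False

    def write(self, key, value):
        # Step 2: while migrating, each write is stored in both places.
        self.new[key] = value
        if self.double_write:
            self.old[key] = value

    def backfill(self):
        # Step 3: copy legacy rows without clobbering fresher double-written data.
        for key, value in self.old.items():
            self.new.setdefault(key, value)

    def read(self, key):
        # Step 4: reads switch to the new storage once the flag is flipped.
        return (self.new if self.read_from_new else self.old)[key]


if __name__ == "__main__":
    store = DoubleWriteStore({"a": 1, "b": 2})
    store.write("b", 20)        # updated during the migration
    store.backfill()
    store.read_from_new = True
    print(store.read("a"), store.read("b"))
    store.double_write = False  # step 6: stop double-writes before dropping old tables
```

Writing to both stores before the backfill finishes is what keeps the old storage usable as a fallback until reads have fully moved over.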
Operational Implications
Everything is harder
Operational Implications of sharding
▪  Monitoring is harder
▪  Schema changes are harder
▪  Upgrades are harder
▪  Backups and restores are harder
▪  Etc. Seriously.
▪  “This will probably never happen” will probably happen
▪  90% of your time can be spent on 10% of the shards (or less)
Top-N monitoring
  Operational Implications
  ▪  Problems with individual shards can get lost in the aggregate or mean
  ▪  Look at the worst “offenders”, identify outliers
  ▪  pmysql is an excellent tool for doing this quickly

$ cat hosts.txt | pmysql 'show status like "threads_running"' | sort -k3 -n | tail -n20
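
The same idea in a few lines, as a sketch with made-up readings: the mean of a per-host metric looks harmless while a top-N listing surfaces the one host in trouble. The host names and values are hypothetical.

```python
import heapq
from statistics import mean


def top_n(values_by_host, n):
    """Return the n (host, value) pairs with the highest values, worst first."""
    return heapq.nlargest(n, values_by_host.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical threads_running readings: 99 quiet hosts and one outlier.
    threads_running = {f"db{i}": 2 for i in range(99)}
    threads_running["db99"] = 150
    print(mean(threads_running.values()))  # the average hides the problem
    print(top_n(threads_running, 3))       # the outlier comes first
```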
Uniformity of shards
Operational Implications
▪  Every shard should have the same schema
▪  Keep the SKUs, configurations, etc. as consistent as possible
▪  Don’t scale shards by migrating the worst to better hardware
 ▪    Ops will have to keep track of this in the future
Application gating
Operational Implications
▪  Very easy for a bad application to consume all shard resources
▪  Limit per-shard concurrency for each application
 ▪    User limits are OK
 ▪    Admission control is better
▪  Log failures at both client and server levels
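
One way to picture admission control is a gate that caps in-flight queries per application and shard, and rejects extra work up front. This is a sketch under that assumption, not the mechanism used at Facebook; `ShardGate` and its methods are hypothetical.

```python
import threading
from collections import defaultdict


class ShardGate:
    """Caps concurrent queries for each (application, shard) pair."""

    def __init__(self, limit):
        self._limit = limit
        self._lock = threading.Lock()
        self._in_flight = defaultdict(int)

    def try_enter(self, app, shard_id):
        """Admit the query only if the app is under its limit on this shard."""
        with self._lock:
            if self._in_flight[(app, shard_id)] >= self._limit:
                return False  # reject early instead of piling onto the shard
            self._in_flight[(app, shard_id)] += 1
            return True

    def leave(self, app, shard_id):
        """Release the slot once the query has finished."""
        with self._lock:
            self._in_flight[(app, shard_id)] -= 1


if __name__ == "__main__":
    gate = ShardGate(limit=2)
    print([gate.try_enter("feed", 38901) for _ in range(3)])  # third is refused
    print(gate.try_enter("ads", 38901))  # other applications are unaffected
```

A caller would log each refusal, which lines up with the advice to record failures on the client side as well as on the server.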
The Good News: efficiencies of scale
Operational Implications
▪  The problems are hard, but there are solutions
▪  Fixing the problems of the worst shards usually also benefits the median shards
▪  Loss of a single shard is not the end of your website
▪  Easy to safely test changes on a small subset
▪  Automation and tooling mean the team can debug and fix problems with high parallelism
(c) 2009 Facebook, Inc. or its licensors.  "Facebook" is a registered trademark of Facebook, Inc.. All rights reserved. 1.0

MySQL Conference 2011 -- The Secret Sauce of Sharding -- Ryan Thiessen

  • 2. The Secret Sauce of Sharding Ryan Thiessen Database Operations April 2011
  • 3. Agenda: 1. Sharding 101; 2. Bad Sharding; 3. Facebook’s Universal Database; 4. Re-Sharding; 5. Operational Implications
  • 5. Bad news: there is no single way to shard ▪ What is the secret sauce of anything? ▪ Some basic building blocks ▪ More about what NOT to do rather than a specific recipe ▪ Wide variation in implementation
  • 6. Why not to shard your data ▪ Can’t do JOINs inside the RDBMS across shards ▪ Data denormalization has drawbacks: redundant storage; chore to keep everything in sync ▪ Ops & maintenance is harder: schema changes are more difficult; monitoring challenges ▪ You don’t do this because it’s cool, but because you have to
  • 7. Why to shard your data ▪  Because you have to ▪  Doing joins outside of the RDBMS isn’t that bad ▪  Less contention on hot tables ▪  Continue using commodity hardware ▪  Single instance failure affects only a small proportion of users
  • 8. Basic building blocks of good sharding ▪ Shard uniformity: SKU, schema, queries ▪ Organize shards according to data access patterns: picking the right key to shard on ▪ Ability to grow, re-shard and shed load quickly ▪ Achieve operational efficiencies of scale
  • 10. “Sharding” by application (Bad Sharding) ▪ Example: each application gets its own database ▪ Result: data distribution is non-uniform, with massive hot spots; every data access pattern is unique; very little efficiency of scale [Diagram: six separate databases: Commerce, User, Logging, Customer, Sales, Config]
  • 11. Fixed hashing (Bad Sharding) ▪ Example: you have X instances, and a hashing algorithm splits data evenly across them ▪ Result: unbalanced load, hot spots ▪ What to do about data growth? ▪ How do you re-shard and/or shed load?
  • 12. Hyper-sharding (Bad Sharding) ▪ Example: hash keys randomly across all instances, without any grouping ▪ Result: every fetch has to touch many shards to fulfill a request; request latency becomes the max() of all shard latencies; a single shard’s availability issue affects every request
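The fan-out cost described on this slide can be illustrated with a short sketch. It assumes, purely for illustration, that each shard is slow or unavailable independently with the same probability; the function names and the numbers are hypothetical and not from the talk.

```python
def fanout_failure_probability(p_shard: float, shards_touched: int) -> float:
    """Chance that at least one touched shard is slow or down.

    Assumes shards misbehave independently with probability p_shard,
    which is a simplification made for this sketch.
    """
    return 1.0 - (1.0 - p_shard) ** shards_touched


def request_latency(shard_latencies_ms: list[float]) -> float:
    """A fan-out request completes only when its slowest shard answers."""
    return max(shard_latencies_ms)


if __name__ == "__main__":
    # With a 1% per-shard problem rate, compare touching 1, 10 and 100 shards.
    for n in (1, 10, 100):
        print(n, round(fanout_failure_probability(0.01, n), 3))
    # One slow shard dominates the whole request.
    print(request_latency([3.0, 4.5, 120.0, 2.8]))
```

Under these assumptions, the share of affected requests grows quickly with the number of shards touched, which is the argument for grouping related keys on one shard.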
  • 13. How to choose a good shard key? ▪  Understand how your applications will access your data ▪  Be careful of data distribution ▪  Example: user ID ▪  Example: time grouping ▪  Example: random sharding ▪  TL;DR: use the same methodology as picking a partition key
  • 15. Multiple shards per physical host (Facebook UDB) ▪ Multiple database shards per MySQL instance ▪ Multiple MySQL instances per host on different ports ▪ Each shard has an identical schema ▪ This enables web scale
  • 16. Hashing (Facebook UDB) ▪ Group related objects together ▪ Collocate most user data on a single shard ▪ If an application has related objects, group them together ▪ When referring to objects in a remote shard, store a reference to the object in both shards ▪ Multiple logical hashing schemes can co-exist over the same set of physical hosts
  • 17. Shard management service (Facebook UDB) ▪ Methods: ▪ Map object IDs to logical (shard) IDs – procedural (simple hash) ▪ Map shard IDs to physical instances – manual ▪ Use Thrift to access these methods from any language ▪ Distribute shard metadata close to apps to reduce request latency ▪ Extremely read heavy ▪ Updated relatively infrequently
  • 18. Example: fetching data from a shard (Facebook UDB)
      ▪ Example: application request to get data for object ID 12345678901
      ▪ Call a function: 12345678901 % 40000 => maps to shard 38901
      ▪ Resolve shard ID 38901 to physical instances:

          Instance      Repl Type   Region   Enabled
          db243:3306    master      A        enabled
          db533:3308    replica     A        enabled
          db874:3306    replica     B        disabled
          db983:3307    replica     B        enabled

      ▪ Application is in region B and only needs to read, so prefer to return a connection to shard 38901 on instance db983:3307
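The lookup on this slide can be sketched as follows. The modulus and the instance table come from the slide; the in-memory `SHARD_MAP`, the `Instance` type and the read-preference policy are assumptions for illustration. The talk describes a Thrift-accessible service whose real API is not shown in the slides.

```python
from dataclasses import dataclass
from typing import Optional

NUM_SHARDS = 40000  # from the slide: object_id % 40000


@dataclass(frozen=True)
class Instance:
    address: str    # host:port
    repl_type: str  # "master" or "replica"
    region: str
    enabled: bool


# Hypothetical local copy of the shard metadata shown on the slide.
SHARD_MAP = {
    38901: [
        Instance("db243:3306", "master", "A", True),
        Instance("db533:3308", "replica", "A", True),
        Instance("db874:3306", "replica", "B", False),
        Instance("db983:3307", "replica", "B", True),
    ],
}


def shard_for(object_id: int) -> int:
    """Procedural mapping from object ID to logical shard ID."""
    return object_id % NUM_SHARDS


def pick_instance(shard_id: int, region: str, need_write: bool) -> Optional[Instance]:
    """Choose an enabled instance for the shard, or None if nothing fits."""
    candidates = [i for i in SHARD_MAP.get(shard_id, []) if i.enabled]
    if need_write:
        masters = [i for i in candidates if i.repl_type == "master"]
        return masters[0] if masters else None
    # Reads: prefer the caller's region, then replicas over the master.
    # This ordering is an assumed policy, not one stated in the talk.
    candidates.sort(key=lambda i: (i.region != region, i.repl_type == "master"))
    return candidates[0] if candidates else None


if __name__ == "__main__":
    shard = shard_for(12345678901)
    print(shard, pick_instance(shard, region="B", need_write=False))
```

Keeping the object-to-shard step procedural and the shard-to-instance step as data is what lets instances be moved or disabled without touching object IDs.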
  • 19. Adding nodes (Facebook UDB)
      ▪ New user pools
      ▪ List(s) of shard IDs where new objects go
      ▪ Reverse the hashing function, generate object ID which maps to one of the new ID pool shards
      ▪ Usually new instances to add more overall capacity to the tier
      ▪ Can be existing instances to get more utilization
      [Diagram: App requests storage on new node -> Get list of available shards, pick one -> Generate ID which maps to that shard -> Connect to the selected shard, save object]
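"Reversing the hashing function" can be sketched for the simple modulo hash used in the fetch example: any ID of the form k * 40000 + shard_id maps back to shard_id. The `IdAllocator` class and its local counter are hypothetical; a real system would need a sequence that is unique across all allocators.

```python
import itertools
import random

NUM_SHARDS = 40000  # modulus from the fetch example


class IdAllocator:
    """Hands out object IDs that hash to a shard from the new-object pool."""

    def __init__(self, new_object_pool, start_sequence=1):
        if not new_object_pool:
            raise ValueError("pool must contain at least one shard ID")
        self.pool = list(new_object_pool)
        # Local counter used only for this sketch (assumption).
        self._sequence = itertools.count(start_sequence)

    def next_id(self, rng=random):
        """Pick a shard from the pool and build an ID that maps to it."""
        shard_id = rng.choice(self.pool)
        # Inverse of `object_id % NUM_SHARDS`.
        object_id = next(self._sequence) * NUM_SHARDS + shard_id
        return object_id, shard_id


if __name__ == "__main__":
    allocator = IdAllocator([38901, 39002])
    object_id, shard_id = allocator.next_id()
    print(object_id, shard_id, object_id % NUM_SHARDS == shard_id)
```

Changing the pool list is enough to steer new objects to new or underused instances, while existing IDs keep resolving to their original shards.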
  • 21. The Easy Way: shedding load (Re-Sharding)
      ▪ Split off logical dbs from a single MySQL instance
      [Diagram: before the split, Host1:3306 and Host2:3306 each hold ShardA, ShardB, ShardC and ShardD; after the split, Host1:3306 holds ShardA and ShardC, and Host2:3306 holds ShardB and ShardD]
      1. Block writes
      2. Break replication from Host1->Host2
      3. Drop databases
      4. Reconfigure Shard Manager to point to new instances
      5. Re-enable writes
      ▪ Splitting off instances running on different ports is easier
  • 22. The Hard Way: double-write data (Re-Sharding)
      1. Create new layout on all new instances
      2. On each new write, store in both places
      3. Separate process to backfill from the legacy storage
      4. Switch over reads to the new storage
      5. Monitor the old storage for reads
      6. Stop double-writes, drop old tables
      ▪ This is I/O intensive and painful, but very possible
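Steps 2 to 5 of the double-write procedure can be sketched with plain dictionaries standing in for the legacy and new storage. The `DoubleWriteStore` class and its method names are hypothetical, and the sketch ignores the concurrency and failure handling a real migration needs.

```python
class DoubleWriteStore:
    """Toy model of a double-write migration between two key-value stores."""

    def __init__(self, old: dict, new: dict):
        self.old = old
        self.new = new
        self.read_from_new = False  # flipped at step 4
        self.old_reads = 0          # watched at step 5

    def write(self, key, value):
        # Step 2: every new write goes to both places.
        self.old[key] = value
        self.new[key] = value

    def backfill(self) -> int:
        # Step 3: copy legacy rows that the new storage does not have yet.
        # Rows already present came from double-writes and are newer,
        # so they are never overwritten.
        copied = 0
        for key, value in list(self.old.items()):
            if key not in self.new:
                self.new[key] = value
                copied += 1
        return copied

    def read(self, key):
        if self.read_from_new:
            return self.new[key]
        self.old_reads += 1
        return self.old[key]


if __name__ == "__main__":
    store = DoubleWriteStore(old={"a": 1, "b": 2}, new={})
    store.write("b", 3)
    print(store.backfill(), store.new)
```

Once `old_reads` stops increasing after the read switch, step 6 (stop double-writes, drop old tables) becomes safe to consider.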
  • 24. Everything is harder (Operational Implications of sharding) ▪ Monitoring is harder ▪ Schema changes are harder ▪ Upgrades are harder ▪ Backups and restores are harder ▪ Etc. Seriously. ▪ “This will probably never happen” will probably happen ▪ 90% of your time can be spent on 10% of the shards (or less)
  • 25. Top-N monitoring (Operational Implications) ▪ Problems with individual shards can get lost in the aggregate or mean ▪ Look at the worst “offenders”, identify outliers ▪ pmysql is an excellent tool for doing this quickly:
      $ cat hosts.txt | pmysql 'show status like "threads_running"' | sort -k3 -n | tail -n20
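Why the mean hides a bad shard can be shown with a small sketch. It assumes the per-instance values (for example threads_running) have already been collected into a dictionary; the `top_n` helper and the sample data are hypothetical and do not reflect how pmysql works internally.

```python
def top_n(samples: dict[str, float], n: int = 20) -> list[tuple[str, float]]:
    """Return the n instances with the highest metric value, worst first."""
    return sorted(samples.items(), key=lambda kv: kv[1], reverse=True)[:n]


def mean(samples: dict[str, float]) -> float:
    """Fleet-wide average of the same metric."""
    return sum(samples.values()) / len(samples)


if __name__ == "__main__":
    # 99 healthy instances and one outlier (made-up numbers).
    samples = {f"db{i}:3306": 2.0 for i in range(99)}
    samples["db999:3306"] = 150.0
    print("mean:", mean(samples))
    print("worst:", top_n(samples, n=3))
```

The average barely moves when one instance out of a hundred misbehaves, while the sorted tail puts that instance at the top of the list.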
  • 26. Uniformity of shards (Operational Implications) ▪ Every shard should have the same schema ▪ Keep the SKUs, configurations, etc., as consistent as possible ▪ Don’t scale shards by migrating the worst to better hardware ▪ Ops will have to keep track of this in the future
  • 27. Application gating (Operational Implications) ▪ Very easy for a bad application to consume all shard resources ▪ Limit per-shard concurrency for each application ▪ User limits are OK ▪ Admission control is better ▪ Log failures at both client and server levels
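A client-side version of per-shard, per-application admission control might look like the sketch below. The `ShardGate` class, the `ShardBusy` exception and the fail-fast policy are assumptions for illustration; the talk does not describe how gating is implemented.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class ShardBusy(Exception):
    """Raised when an application is already at its limit on a shard."""


class ShardGate:
    def __init__(self, limit_per_app_per_shard: int):
        self._limit = limit_per_app_per_shard
        self._lock = threading.Lock()
        self._in_flight = defaultdict(int)  # (app, shard_id) -> active requests
        self.rejections = []                # client-side failure log

    @contextmanager
    def admit(self, app: str, shard_id: int):
        key = (app, shard_id)
        with self._lock:
            if self._in_flight[key] >= self._limit:
                # Fail fast instead of queueing, and record the rejection.
                self.rejections.append(key)
                raise ShardBusy(f"{app} is at its limit on shard {shard_id}")
            self._in_flight[key] += 1
        try:
            yield
        finally:
            with self._lock:
                self._in_flight[key] -= 1


if __name__ == "__main__":
    gate = ShardGate(limit_per_app_per_shard=1)
    with gate.admit("feed", 38901):
        try:
            with gate.admit("feed", 38901):
                pass
        except ShardBusy as exc:
            print("rejected:", exc)
```

Because the limit is keyed by application and shard, one misbehaving application is rejected on the shard it is overloading while other applications continue to be admitted.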
  • 28. The Good News: efficiencies of scale (Operational Implications) ▪ The problems are hard, but there are solutions ▪ Fixing the problems of the worst shards usually also benefits the median shards ▪ Loss of a single shard is not the end of your website ▪ Easy to safely test changes on a small subset ▪ Automation and tooling mean the team can debug and fix problems with high parallelism
  • 29. (c) 2009 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved. 1.0