SlideShare uma empresa Scribd logo
1 de 68
Baixar para ler offline
Introduction to Sharding
      Tyler Brock
      Software Engineer, 10gen




Wednesday, March 27, 13
Agenda

      • Scaling Data
      • MongoDB's Approach
      • Architecture
      • Configuration
      • Mechanics




Wednesday, March 27, 13
Scaling Data




Wednesday, March 27, 13
Examining Growth




Wednesday, March 27, 13
Examining Growth

      • User Growth
             – 1995: 0.4% of the world’s population
             – Today: 30% of the world is online (~2.2B)
             – Emerging Markets & Mobile




Wednesday, March 27, 13
Examining Growth

      • User Growth
             – 1995: 0.4% of the world’s population
             – Today: 30% of the world is online (~2.2B)
             – Emerging Markets & Mobile

      • Data Set Growth
             – Facebook’s data set is around 100 petabytes
             – 4 billion photos taken in the last year (4x a decade ago)




Wednesday, March 27, 13
Working Set Exceeds Physical Memory


Wednesday, March 27, 13
Read/Write Throughput Exceeds I/O


Wednesday, March 27, 13
Vertical Scalability (Scale Up)


Wednesday, March 27, 13
Horizontal Scalability (Scale Out)


Wednesday, March 27, 13
Data Store Scalability

      • Custom Hardware
             – Oracle

      • Custom Software
             – Facebook + MySQL
             – Google




Wednesday, March 27, 13
Data Store Scalability Today

      • MongoDB Auto-Sharding
      • A data store that is
             – Free
             – Publicly available
             – Open Source (https://github.com/mongodb/mongo)
             – Horizontally scalable
             – Application independent




Wednesday, March 27, 13
MongoDB's Approach to
      Sharding




Wednesday, March 27, 13
Partitioning

      • User defines shard key
      • Shard key defines range of data
      • Key space is like points on a line
      • Range is a segment of that line




Wednesday, March 27, 13
Data Distribution

      • Initially 1 chunk
      • Default max chunk size: 64mb
      • MongoDB automatically splits & migrates chunks
          when max reached




Wednesday, March 27, 13
Routing and Balancing

      • Queries routed to specific
          shards
      • MongoDB balances cluster
      • MongoDB migrates data to
          new nodes




Wednesday, March 27, 13
MongoDB Auto-Sharding

      • Minimal effort required
             – Same interface as single mongod

      • Two steps
             – Enable Sharding for a database
             – Shard collection within database




Wednesday, March 27, 13
Architecture




Wednesday, March 27, 13
What is a Shard?
      • Shard is a node of the cluster
      • Shard can be a single mongod or a replica set




Wednesday, March 27, 13
Meta Data Storage

      • Config Server
         – Stores cluster chunk ranges and locations
         – Can have only 1 or 3 (production must have 3)
         – Not a replica set




Wednesday, March 27, 13
Routing and Managing Data

      • Mongos
         – Acts as a router / balancer
         – No local data (persists to config database)
         – Can have 1 or many




Wednesday, March 27, 13
Sharding infrastructure


Wednesday, March 27, 13
Configuration




Wednesday, March 27, 13
Example Cluster




        • Don’t use this setup in production!
           - Only one Config server (No Fault Tolerance)
           - Shard not in a replica set (Low Availability)
           - Only one mongos and shard (No Performance Improvement)




Wednesday, March 27, 13
Starting the Configuration Server




        • mongod --configsvr
        • Starts a configuration server on the default port (27019)




Wednesday, March 27, 13
Start the mongos Router




        • mongos --configdb <hostname>:27019
        • For 3 configuration servers:
           mongos --configdb
           <host1>:<port1>,<host2>:<port2>,<host3>:<port3>
        • This is always how to start a new mongos, even if the cluster is
           already running


Wednesday, March 27, 13
Start the shard database




        •   mongod --shardsvr
        •   Starts a mongod with the default shard port (27018)
        •   Shard is not yet connected to the rest of the cluster
        •   Shard may have already been running in production



Wednesday, March 27, 13
Add the Shard




        • On mongos:
           - sh.addShard(‘<host>:27018’)
        • Adding a replica set:




Wednesday, March 27, 13
Verify that the shard was added




        • db.runCommand({ listshards:1 })
            { "shards" :
            ! [{"_id”: "shard0000”,"host”: ”<hostname>:27018” } ],
                 "ok" : 1
             }




Wednesday, March 27, 13
Enabling Sharding

      • Enable sharding on a database

             sh.enableSharding(“<dbname>”)

      • Shard a collection with the given key

             sh.shardCollection(“<dbname>.people”,{“country”:1})

      • Use a compound shard key to prevent duplicates

             sh.shardCollection(“<dbname>.cars”,{“year”:1, ”uniqueid”:1})




Wednesday, March 27, 13
Tag Aware Sharding

      • Tag aware sharding allows you to control the
          distribution of your data
      • Tag a range of shard keys
             – sh.addTagRange(<collection>,<min>,<max>,<tag>)

      • Tag a shard
             – sh.addShardTag(<shard>,<tag>)




Wednesday, March 27, 13
Mechanics




Wednesday, March 27, 13
Partitioning

      • Remember it's based on ranges




Wednesday, March 27, 13
Chunk is a section of the entire range


Wednesday, March 27, 13
Chunk splitting




     • A chunk is split once it exceeds the maximum size
     • There is no split point if all documents have the same shard key
     • Chunk split is a logical operation (no data is moved)




Wednesday, March 27, 13
Balancing




        • Balancer is running on mongos
        • Once the difference in chunks between the most dense shard
           and the least dense shard is above the migration threshold, a
           balancing round starts




Wednesday, March 27, 13
Acquiring the Balancer Lock




        • The balancer on mongos takes out a “balancer lock”
        • To see the status of these locks:
           use config
           db.locks.find({ _id: “balancer” })




Wednesday, March 27, 13
Moving the chunk




        • The mongos sends a moveChunk command to source shard
        • The source shard then notifies destination shard
        • Destination shard starts pulling documents from source shard




Wednesday, March 27, 13
Committing Migration




        • When complete, destination shard updates config server
           - Provides new locations of the chunks




Wednesday, March 27, 13
Cleanup




        • Source shard deletes moved data
           - Must wait for open cursors to either close or time out
           - NoTimeout cursors may prevent the release of the lock
        • The mongos releases the balancer lock after old chunks are




Wednesday, March 27, 13
Routing Requests




Wednesday, March 27, 13
Cluster Request Routing

      • Targeted Queries
      • Scatter Gather Queries
      • Scatter Gather Queries with Sort




Wednesday, March 27, 13
Cluster Request Routing: Targeted Query


Wednesday, March 27, 13
Routable request received


Wednesday, March 27, 13
Request routed to appropriate shard


Wednesday, March 27, 13
Shard returns results


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Cluster Request Routing: Non-Targeted Query


Wednesday, March 27, 13
Non-Targeted Request Received


Wednesday, March 27, 13
Request sent to all shards


Wednesday, March 27, 13
Shards return results to mongos


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Cluster Request Routing: Non-Targeted Query
      with Sort


Wednesday, March 27, 13
Non-Targeted request with sort received


Wednesday, March 27, 13
Request sent to all shards


Wednesday, March 27, 13
Query and sort performed locally


Wednesday, March 27, 13
Shards return results to mongos


Wednesday, March 27, 13
Mongos merges sorted results


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Shard Key




Wednesday, March 27, 13
Shard Key

      • Shard key is immutable
      • Shard key values are immutable
      • Shard key must be indexed
      • Shard key limited to 512 bytes in size
      • Shard key used to route queries
             – Choose a field commonly used in queries

      • Only shard key can be unique across shards
             – `_id` field is only unique within individual shard



Wednesday, March 27, 13
Shard Key Considerations

      • Cardinality
      • Write Distribution
      • Query Isolation
      • Reliability
      • Index Locality




Wednesday, March 27, 13
Conclusion




Wednesday, March 27, 13
Read/Write Throughput Exceeds I/O


Wednesday, March 27, 13
Working Set Exceeds Physical Memory


Wednesday, March 27, 13
Sharding Enables Scale

      • MongoDB’s Auto-Sharding
             – Easy to Configure
             – Consistent Interface
             – Free and Open Source




Wednesday, March 27, 13
• What’s next?
             – [ Insert Related Talks ]
             – [ Insert Upcoming Webinars ]
             – MongoDB User Group

      • Resources
                  https://education.10gen.com/
                  http://www.10gen.com/presentations




Wednesday, March 27, 13
Thank You
      Tyler Brock
      Software Engineer, 10gen




Wednesday, March 27, 13

Mais conteúdo relacionado

Mais procurados

Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performanceDaum DNA
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB ShardingRob Walters
 
MongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB
 
Lightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingLightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingMongoDB
 
I have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingI have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingDavid Murphy
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Randall Hunt
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentMongoDB
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About ShardingMongoDB
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB
 
Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)Ontico
 
Sharding
ShardingSharding
ShardingMongoDB
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...leifwalsh
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
Exploring the replication in MongoDB
Exploring the replication in MongoDBExploring the replication in MongoDB
Exploring the replication in MongoDBIgor Donchovski
 
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamEvolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamMydbops
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingMongoDB
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDBRick Copeland
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity PlanningNorberto Leite
 

Mais procurados (20)

Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performance
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Sharding
 
MongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo Seattle
 
Lightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingLightning Talk: MongoDB Sharding
Lightning Talk: MongoDB Sharding
 
I have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingI have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced Sharding
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment Checklist
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: Sharding
 
Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)
 
Sharding
ShardingSharding
Sharding
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Exploring the replication in MongoDB
Exploring the replication in MongoDBExploring the replication in MongoDB
Exploring the replication in MongoDB
 
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamEvolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to Sharding
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDB
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity Planning
 

Semelhante a Sharding

Webinar: Sharding
Webinar: ShardingWebinar: Sharding
Webinar: ShardingMongoDB
 
Sharding - Seoul 2012
Sharding - Seoul 2012Sharding - Seoul 2012
Sharding - Seoul 2012MongoDB
 
Sharding
ShardingSharding
ShardingMongoDB
 
Sharding
ShardingSharding
ShardingMongoDB
 
What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?Ivan Zoratti
 
Sharding Overview
Sharding OverviewSharding Overview
Sharding OverviewMongoDB
 
Shard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackShard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackJustin Swanhart
 
Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB MongoDB
 
Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.MongoDB
 
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...MongoDB
 
Cloud east shutl_talk
Cloud east shutl_talkCloud east shutl_talk
Cloud east shutl_talkVolker Pacher
 
Yet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepYet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepDenish Patel
 
Scaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTPScaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTPdarkdata
 
Back to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingBack to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingMongoDB
 
MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Consjohnrjenson
 
Conceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLConceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLMongoDB
 
Fractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to PracticeFractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to PracticeTim Callaghan
 
Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)Michael Redlich
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceSasidhar Gogulapati
 
Mongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeMongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeSpyros Passas
 

Semelhante a Sharding (20)

Webinar: Sharding
Webinar: ShardingWebinar: Sharding
Webinar: Sharding
 
Sharding - Seoul 2012
Sharding - Seoul 2012Sharding - Seoul 2012
Sharding - Seoul 2012
 
Sharding
ShardingSharding
Sharding
 
Sharding
ShardingSharding
Sharding
 
What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?
 
Sharding Overview
Sharding OverviewSharding Overview
Sharding Overview
 
Shard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackShard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stack
 
Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB
 
Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.
 
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
 
Cloud east shutl_talk
Cloud east shutl_talkCloud east shutl_talk
Cloud east shutl_talk
 
Yet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepYet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRep
 
Scaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTPScaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTP
 
Back to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingBack to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to sharding
 
MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Cons
 
Conceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLConceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQL
 
Fractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to PracticeFractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to Practice
 
Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & Performance
 
Mongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeMongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappe
 

Mais de MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Mais de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Último

Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 

Último (20)

Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 

Sharding

  • 1. Introduction to Sharding Tyler Brock Software Engineer, 10gen Wednesday, March 27, 13
  • 2. Agenda • Scaling Data • MongoDB's Approach • Architecture • Configuration • Mechanics Wednesday, March 27, 13
  • 5. Examining Growth • User Growth – 1995: 0.4% of the world’s population – Today: 30% of the world is online (~2.2B) – Emerging Markets & Mobile Wednesday, March 27, 13
  • 6. Examining Growth • User Growth – 1995: 0.4% of the world’s population – Today: 30% of the world is online (~2.2B) – Emerging Markets & Mobile • Data Set Growth – Facebook’s data set is around 100 petabytes – 4 billion photos taken in the last year (4x a decade ago) Wednesday, March 27, 13
  • 7. Working Set Exceeds Physical Memory Wednesday, March 27, 13
  • 8. Read/Write Throughput Exceeds I/O Wednesday, March 27, 13
  • 9. Vertical Scalability (Scale Up) Wednesday, March 27, 13
  • 10. Horizontal Scalability (Scale Out) Wednesday, March 27, 13
  • 11. Data Store Scalability • Custom Hardware – Oracle • Custom Software – Facebook + MySQL – Google Wednesday, March 27, 13
  • 12. Data Store Scalability Today • MongoDB Auto-Sharding • A data store that is – Free – Publicly available – Open Source (https://github.com/mongodb/mongo) – Horizontally scalable – Application independent Wednesday, March 27, 13
  • 13. MongoDB's Approach to Sharding Wednesday, March 27, 13
  • 14. Partitioning • User defines shard key • Shard key defines range of data • Key space is like points on a line • Range is a segment of that line Wednesday, March 27, 13
  • 15. Data Distribution • Initially 1 chunk • Default max chunk size: 64mb • MongoDB automatically splits & migrates chunks when max reached Wednesday, March 27, 13
  • 16. Routing and Balancing • Queries routed to specific shards • MongoDB balances cluster • MongoDB migrates data to new nodes Wednesday, March 27, 13
  • 17. MongoDB Auto-Sharding • Minimal effort required – Same interface as single mongod • Two steps – Enable Sharding for a database – Shard collection within database Wednesday, March 27, 13
  • 19. What is a Shard? • Shard is a node of the cluster • Shard can be a single mongod or a replica set Wednesday, March 27, 13
  • 20. Meta Data Storage • Config Server – Stores cluster chunk ranges and locations – Can have only 1 or 3 (production must have 3) – Not a replica set Wednesday, March 27, 13
  • 21. Routing and Managing Data • Mongos – Acts as a router / balancer – No local data (persists to config database) – Can have 1 or many Wednesday, March 27, 13
  • 24. Example Cluster • Don’t use this setup in production! - Only one Config server (No Fault Tolerance) - Shard not in a replica set (Low Availability) - Only one mongos and shard (No Performance Improvement) Wednesday, March 27, 13
  • 25. Starting the Configuration Server • mongod --configsvr • Starts a configuration server on the default port (27019) Wednesday, March 27, 13
  • 26. Start the mongos Router • mongos --configdb <hostname>:27019 • For 3 configuration servers: mongos --configdb <host1>:<port1>,<host2>:<port2>,<host3>:<port3> • This is always how to start a new mongos, even if the cluster is already running Wednesday, March 27, 13
  • 27. Start the shard database • mongod --shardsvr • Starts a mongod with the default shard port (27018) • Shard is not yet connected to the rest of the cluster • Shard may have already been running in production Wednesday, March 27, 13
  • 28. Add the Shard • On mongos: - sh.addShard(‘<host>:27018’) • Adding a replica set: Wednesday, March 27, 13
  • 29. Verify that the shard was added • db.runCommand({ listshards:1 }) { "shards" : ! [{"_id”: "shard0000”,"host”: ”<hostname>:27018” } ], "ok" : 1 } Wednesday, March 27, 13
  • 30. Enabling Sharding • Enable sharding on a database sh.enableSharding(“<dbname>”) • Shard a collection with the given key sh.shardCollection(“<dbname>.people”,{“country”:1}) • Use a compound shard key to prevent duplicates sh.shardCollection(“<dbname>.cars”,{“year”:1, ”uniqueid”:1}) Wednesday, March 27, 13
  • 31. Tag Aware Sharding • Tag aware sharding allows you to control the distribution of your data • Tag a range of shard keys – sh.addTagRange(<collection>,<min>,<max>,<tag>) • Tag a shard – sh.addShardTag(<shard>,<tag>) Wednesday, March 27, 13
  • 33. Partitioning • Remember it's based on ranges Wednesday, March 27, 13
  • 34. Chunk is a section of the entire range Wednesday, March 27, 13
  • 35. Chunk splitting • A chunk is split once it exceeds the maximum size • There is no split point if all documents have the same shard key • Chunk split is a logical operation (no data is moved) Wednesday, March 27, 13
  • 36. Balancing • Balancer is running on mongos • Once the difference in chunks between the most dense shard and the least dense shard is above the migration threshold, a balancing round starts Wednesday, March 27, 13
  • 37. Acquiring the Balancer Lock • The balancer on mongos takes out a “balancer lock” • To see the status of these locks: use config db.locks.find({ _id: “balancer” }) Wednesday, March 27, 13
  • 38. Moving the chunk • The mongos sends a moveChunk command to source shard • The source shard then notifies destination shard • Destination shard starts pulling documents from source shard Wednesday, March 27, 13
  • 39. Committing Migration • When complete, destination shard updates config server - Provides new locations of the chunks Wednesday, March 27, 13
  • 40. Cleanup • Source shard deletes moved data - Must wait for open cursors to either close or time out - NoTimeout cursors may prevent the release of the lock • The mongos releases the balancer lock after old chunks are Wednesday, March 27, 13
  • 42. Cluster Request Routing • Targeted Queries • Scatter Gather Queries • Scatter Gather Queries with Sort Wednesday, March 27, 13
  • 43. Cluster Request Routing: Targeted Query Wednesday, March 27, 13
  • 45. Request routed to appropriate shard Wednesday, March 27, 13
  • 47. Mongos returns results to client Wednesday, March 27, 13
  • 48. Cluster Request Routing: Non-Targeted Query Wednesday, March 27, 13
  • 50. Request sent to all shards Wednesday, March 27, 13
  • 51. Shards return results to mongos Wednesday, March 27, 13
  • 52. Mongos returns results to client Wednesday, March 27, 13
  • 53. Cluster Request Routing: Non-Targeted Query with Sort Wednesday, March 27, 13
  • 54. Non-Targeted request with sort received Wednesday, March 27, 13
  • 55. Request sent to all shards Wednesday, March 27, 13
  • 56. Query and sort performed locally Wednesday, March 27, 13
  • 57. Shards return results to mongos Wednesday, March 27, 13
  • 58. Mongos merges sorted results Wednesday, March 27, 13
  • 59. Mongos returns results to client Wednesday, March 27, 13
  • 61. Shard Key • Shard key is immutable • Shard key values are immutable • Shard key must be indexed • Shard key limited to 512 bytes in size • Shard key used to route queries – Choose a field commonly used in queries • Only shard key can be unique across shards – `_id` field is only unique within individual shard Wednesday, March 27, 13
  • 62. Shard Key Considerations • Cardinality • Write Distribution • Query Isolation • Reliability • Index Locality Wednesday, March 27, 13
  • 64. Read/Write Throughput Exceeds I/O Wednesday, March 27, 13
  • 65. Working Set Exceeds Physical Memory Wednesday, March 27, 13
  • 66. Sharding Enables Scale • MongoDB’s Auto-Sharding – Easy to Configure – Consistent Interface – Free and Open Source Wednesday, March 27, 13
  • 67. • What’s next? – [ Insert Related Talks ] – [ Insert Upcoming Webinars ] – MongoDB User Group • Resources https://education.10gen.com/ http://www.10gen.com/presentations Wednesday, March 27, 13
  • 68. Thank You Tyler Brock Software Engineer, 10gen Wednesday, March 27, 13