SlideShare a Scribd company logo
1 of 20
Sharding Internals
     Eliot Horowitz
     @eliothorowitz
        MongoSV
    December 3, 2010
MongoDB Sharding

• Scale horizontally for data size, index size,
  write and consistent read scaling
• Distribute databases, collections or a
  objects in a collection
• Auto-balancing, migrations, management
  happen with no down time
• Choose how you partition data
• Can convert from single master to sharded
  system with no downtime
• Same features as non-sharding single
  master
• Fully consistent
Range Based
         MIN        MAX      LOCATION
            A        F        shard1
            F        M        shard1
            M        R        shard2
            R        Z        shard3




• collection is broken into chunks by range
• chunks default to 200mb or 100,000
  objects
Architecture
                     Shards
            mongod   mongod     mongod
                                               ...
 Config      mongod   mongod     mongod
 Servers

mongod

mongod

mongod               mongos    mongos    ...


                      client
Shards


• Can be master, master/slave or replica sets
• Replica sets gives sharding + full auto-
  failover
• Regular mongod processes
Config Servers

• 3 of them
• changes are made with 2 phase commit
• if any are down, meta data goes read only
• system is online as long as 1/3 is up
mongos

• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on appserver so no extra network
  traffic
• Cache meta data from config servers
Writes

• Inserts : require shard key, routed
• Removes: routed and/or scattered
• Updates: routed or scattered
Queries

• By shard key: routed
• sorted by shard key: routed in order
• by non shard key: scatter gather
• sorted by non shard key: distributed merge
  sort
Splitting

• Take a chunk and split it in 2
• Splits on the median value
• Splits only change meta data, no data
  change
Splitting
T1
     MIN      MAX   LOCATION
      A        Z      shard1


T2
     MIN      MAX   LOCATION
      A        G       shard1
      G        Z       shard1


T3
     MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z      shard1
Balancing

• Moves chunks from one shard to another
• Done online while system is running
• Balancing runs in the background
Migrating
T3   MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z      shard1

T4   MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z     shard2
T5
     MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S     shard2
      S        Z      shard2
Setting it Up
•   Start servers

•   add shards: db.runCommand( { addshard :
    "10.1.1.5" } )

•   turn on partitioning:
    db.runCommand( { enablesharding : "test" }

•   shard a collection:
    db.runCommand( { shardcollection : "test.data" ,
    key : { num : 1 } } )
User profiles

• Partition by user_id
• Secondary indexes on location, dates, etc...
• Reads/writes know which shard to hit
User Activity Stream

• Shard by user_id
• Loading a user’s stream hits a single shard
• Writes are distributed across all shards
• Can index on activity for deleting
Photos


• Can shard by photo_id for best read/write
  distribution
• Secondary index on tags, date
Logging

Possible Shard Keys


• date
• machine, date
• logger name
Download MongoDB
      http://www.mongodb.org



   and
let
us
know
what
you
think
    @eliothorowitz



@mongodb


       10gen is hiring!
http://www.10gen.com/jobs

More Related Content

Viewers also liked

รู้สิ่งใดไม่สู้...รู้งี้....
รู้สิ่งใดไม่สู้...รู้งี้....รู้สิ่งใดไม่สู้...รู้งี้....
รู้สิ่งใดไม่สู้...รู้งี้....Supasate Choochaisri
 
Choosing a Shard key
Choosing a Shard keyChoosing a Shard key
Choosing a Shard keyMongoDB
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB
 
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...MongoDB
 
Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)MongoSF
 
Eclipse Paho - MQTT and the Internet of Things
Eclipse Paho - MQTT and the Internet of ThingsEclipse Paho - MQTT and the Internet of Things
Eclipse Paho - MQTT and the Internet of ThingsAndy Piper
 
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...MongoDB
 
How to monitor MongoDB
How to monitor MongoDBHow to monitor MongoDB
How to monitor MongoDBServer Density
 
Efficient Pagination Using MySQL
Efficient Pagination Using MySQLEfficient Pagination Using MySQL
Efficient Pagination Using MySQLSurat Singh Bhati
 
Building Universal Web Apps with React ForwardJS 2017
Building Universal Web Apps with React ForwardJS 2017Building Universal Web Apps with React ForwardJS 2017
Building Universal Web Apps with React ForwardJS 2017Elyse Kolker Gordon
 
Pagination Done the Right Way
Pagination Done the Right WayPagination Done the Right Way
Pagination Done the Right WayMarkus Winand
 
Open Source IoT at Eclipse
Open Source IoT at EclipseOpen Source IoT at Eclipse
Open Source IoT at EclipseIan Skerrett
 
BigData_TP5 : Neo4J
BigData_TP5 : Neo4JBigData_TP5 : Neo4J
BigData_TP5 : Neo4JLilia Sfaxi
 
BigData_TP2: Design Patterns dans Hadoop
BigData_TP2: Design Patterns dans HadoopBigData_TP2: Design Patterns dans Hadoop
BigData_TP2: Design Patterns dans HadoopLilia Sfaxi
 
BigData_TP4 : Cassandra
BigData_TP4 : CassandraBigData_TP4 : Cassandra
BigData_TP4 : CassandraLilia Sfaxi
 
BigData_Chp5: Putting it all together
BigData_Chp5: Putting it all togetherBigData_Chp5: Putting it all together
BigData_Chp5: Putting it all togetherLilia Sfaxi
 
BigData_TP1: Initiation à Hadoop et Map-Reduce
BigData_TP1: Initiation à Hadoop et Map-ReduceBigData_TP1: Initiation à Hadoop et Map-Reduce
BigData_TP1: Initiation à Hadoop et Map-ReduceLilia Sfaxi
 
BigData_TP3 : Spark
BigData_TP3 : SparkBigData_TP3 : Spark
BigData_TP3 : SparkLilia Sfaxi
 
BigData_Chp3: Data Processing
BigData_Chp3: Data ProcessingBigData_Chp3: Data Processing
BigData_Chp3: Data ProcessingLilia Sfaxi
 
BigData_Chp2: Hadoop & Map-Reduce
BigData_Chp2: Hadoop & Map-ReduceBigData_Chp2: Hadoop & Map-Reduce
BigData_Chp2: Hadoop & Map-ReduceLilia Sfaxi
 

Viewers also liked (20)

รู้สิ่งใดไม่สู้...รู้งี้....
รู้สิ่งใดไม่สู้...รู้งี้....รู้สิ่งใดไม่สู้...รู้งี้....
รู้สิ่งใดไม่สู้...รู้งี้....
 
Choosing a Shard key
Choosing a Shard keyChoosing a Shard key
Choosing a Shard key
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment Checklist
 
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
 
Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)
 
Eclipse Paho - MQTT and the Internet of Things
Eclipse Paho - MQTT and the Internet of ThingsEclipse Paho - MQTT and the Internet of Things
Eclipse Paho - MQTT and the Internet of Things
 
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...
Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS A...
 
How to monitor MongoDB
How to monitor MongoDBHow to monitor MongoDB
How to monitor MongoDB
 
Efficient Pagination Using MySQL
Efficient Pagination Using MySQLEfficient Pagination Using MySQL
Efficient Pagination Using MySQL
 
Building Universal Web Apps with React ForwardJS 2017
Building Universal Web Apps with React ForwardJS 2017Building Universal Web Apps with React ForwardJS 2017
Building Universal Web Apps with React ForwardJS 2017
 
Pagination Done the Right Way
Pagination Done the Right WayPagination Done the Right Way
Pagination Done the Right Way
 
Open Source IoT at Eclipse
Open Source IoT at EclipseOpen Source IoT at Eclipse
Open Source IoT at Eclipse
 
BigData_TP5 : Neo4J
BigData_TP5 : Neo4JBigData_TP5 : Neo4J
BigData_TP5 : Neo4J
 
BigData_TP2: Design Patterns dans Hadoop
BigData_TP2: Design Patterns dans HadoopBigData_TP2: Design Patterns dans Hadoop
BigData_TP2: Design Patterns dans Hadoop
 
BigData_TP4 : Cassandra
BigData_TP4 : CassandraBigData_TP4 : Cassandra
BigData_TP4 : Cassandra
 
BigData_Chp5: Putting it all together
BigData_Chp5: Putting it all togetherBigData_Chp5: Putting it all together
BigData_Chp5: Putting it all together
 
BigData_TP1: Initiation à Hadoop et Map-Reduce
BigData_TP1: Initiation à Hadoop et Map-ReduceBigData_TP1: Initiation à Hadoop et Map-Reduce
BigData_TP1: Initiation à Hadoop et Map-Reduce
 
BigData_TP3 : Spark
BigData_TP3 : SparkBigData_TP3 : Spark
BigData_TP3 : Spark
 
BigData_Chp3: Data Processing
BigData_Chp3: Data ProcessingBigData_Chp3: Data Processing
BigData_Chp3: Data Processing
 
BigData_Chp2: Hadoop & Map-Reduce
BigData_Chp2: Hadoop & Map-ReduceBigData_Chp2: Hadoop & Map-Reduce
BigData_Chp2: Hadoop & Map-Reduce
 

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Recently uploaded

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 

Recently uploaded (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 

MongoDB Sharding Internals

  • 1. Sharding Internals Eliot Horowitz @eliothorowitz MongoSV December 3, 2010
  • 2. MongoDB Sharding • Scale horizontally for data size, index size, write and consistent read scaling • Distribute databases, collections or a objects in a collection • Auto-balancing, migrations, management happen with no down time
  • 3. • Choose how you partition data • Can convert from single master to sharded system with no downtime • Same features as non-sharding single master • Fully consistent
  • 4. Range Based MIN MAX LOCATION A F shard1 F M shard1 M R shard2 R Z shard3 • collection is broken into chunks by range • chunks default to 200mb or 100,000 objects
  • 5. Architecture Shards mongod mongod mongod ... Config mongod mongod mongod Servers mongod mongod mongod mongos mongos ... client
  • 6. Shards • Can be master, master/slave or replica sets • Replica sets gives sharding + full auto- failover • Regular mongod processes
  • 7. Config Servers • 3 of them • changes are made with 2 phase commit • if any are down, meta data goes read only • system is online as long as 1/3 is up
  • 8. mongos • Sharding Router • Acts just like a mongod to clients • Can have 1 or as many as you want • Can run on appserver so no extra network traffic • Cache meta data from config servers
  • 9. Writes • Inserts : require shard key, routed • Removes: routed and/or scattered • Updates: routed or scattered
  • 10. Queries • By shard key: routed • sorted by shard key: routed in order • by non shard key: scatter gather • sorted by non shard key: distributed merge sort
  • 11. Splitting • Take a chunk and split it in 2 • Splits on the median value • Splits only change meta data, no data change
  • 12. Splitting T1 MIN MAX LOCATION A Z shard1 T2 MIN MAX LOCATION A G shard1 G Z shard1 T3 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard1
  • 13. Balancing • Moves chunks from one shard to another • Done online while system is running • Balancing runs in the background
  • 14. Migrating T3 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard1 T4 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard2 T5 MIN MAX LOCATION A D shard1 D G shard1 G S shard2 S Z shard2
  • 15. Setting it Up • Start servers • add shards: db.runCommand( { addshard : "10.1.1.5" } ) • turn on partitioning: db.runCommand( { enablesharding : "test" } • shard a collection: db.runCommand( { shardcollection : "test.data" , key : { num : 1 } } )
  • 16. User profiles • Partition by user_id • Secondary indexes on location, dates, etc... • Reads/writes know which shard to hit
  • 17. User Activity Stream • Shard by user_id • Loading a user’s stream hits a single shard • Writes are distributed across all shards • Can index on activity for deleting
  • 18. Photos • Can shard by photo_id for best read/write distribution • Secondary index on tags, date
  • 19. Logging Possible Shard Keys • date • machine, date • logger name
  • 20. Download MongoDB http://www.mongodb.org and
let
us
know
what
you
think @eliothorowitz



@mongodb 10gen is hiring! http://www.10gen.com/jobs

Editor's Notes

  1. \n
  2. \n
  3. for inconsistent read scaling\n\n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. don’t shard by date\n\n
  21. \n
  22. \n
  23. \n