2. Why Distributed Databases
• Vertical scaling is limited by physics and
cost
• Hard to scale vertically in the cloud
• Can scale out (more machines) further than up (bigger machines)
3. Methods of Distribution
• ad-hoc partitioning
• consistent hashing (Dynamo)
• range based partitioning (BigTable/PNUTS)
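To make the consistent hashing bullet concrete, here is a minimal Python sketch of a hash ring. It is an illustration only, not Dynamo's actual implementation: the names HashRing, node_for and the vnodes parameter are hypothetical, and MD5 is an arbitrary choice of hash.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a point on the ring (first 4 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Consistent hashing: nodes and keys are hashed into the same space."""

    def __init__(self, nodes, vnodes=64):
        # Place each node at several points ("virtual nodes") to even out load.
        self._points = sorted(
            (ring_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def node_for(self, key: str) -> str:
        # The owner is the first ring point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._hashes, ring_hash(key)) % len(self._points)
        return self._points[idx][1]


if __name__ == "__main__":
    keys = [f"user{i}" for i in range(1000)]
    before = HashRing(["a", "b", "c"])
    after = HashRing(["a", "b", "c", "d"])
    moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
    # Only keys claimed by the new node "d" change owner.
    print(f"{len(moved)} of {len(keys)} keys moved")
```

The property worth noticing is that adding a node only moves the keys that the new node takes over; everything else stays where it was.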
4. Mongo Sharding
• Can distribute databases, collections, or objects in a collection
• Choose how you partition data
• Balancing, migrations, management all
automatic
5. Mongo Sharding (continued)
• Range based
• Can convert from single master to sharded system with zero downtime
• Almost no functionality lost over single
master
• Fully consistent
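A rough Python sketch of range-based sharding with automatic balancing might look like the following. It is not MongoDB's actual code: ChunkTable, split and balance are hypothetical names, keys are assumed to be strings, and "load" is simplified to the number of chunks per shard.

```python
import bisect
from collections import Counter

MIN_KEY = ""  # hypothetical sentinel: sorts before every other string key


class ChunkTable:
    """Sorted, non-overlapping [low, high) key ranges, each owned by one shard."""

    def __init__(self, shard):
        self.lows = [MIN_KEY]   # lower bound of each chunk, kept sorted
        self.owners = [shard]   # owners[i] owns [lows[i], lows[i + 1])

    def shard_for(self, key):
        # The chunk containing `key` is the last one whose lower bound is <= key.
        return self.owners[bisect.bisect_right(self.lows, key) - 1]

    def split(self, at):
        """Split the chunk containing `at` in two; both halves stay on the same shard."""
        i = bisect.bisect_right(self.lows, at)
        if self.lows[i - 1] == at:
            return  # `at` is already a chunk boundary
        self.lows.insert(i, at)
        self.owners.insert(i, self.owners[i - 1])

    def balance(self, shards):
        """Migrate one chunk from the most loaded shard to the least loaded one."""
        counts = Counter({s: 0 for s in shards})
        counts.update(self.owners)
        donor = max(shards, key=lambda s: counts[s])
        receiver = min(shards, key=lambda s: counts[s])
        if counts[donor] - counts[receiver] < 2:
            return False  # already balanced, nothing to migrate
        self.owners[self.owners.index(donor)] = receiver
        return True


if __name__ == "__main__":
    table = ChunkTable("shard0")
    table.split("h")
    table.split("p")
    while table.balance(["shard0", "shard1"]):
        pass
    for key in ("alice", "mike", "zed"):
        print(key, "->", table.shard_for(key))
```

Because a migration only changes which shard owns a range, lookups keep working before and after it, which is the intuition behind converting to a sharded system without downtime.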
8. Config Servers
• 3 of them
• changes are made with two-phase commit
• if any are down, metadata goes read-only
• system is online as long as 1 of the 3 is up
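The rules above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the real config server protocol: ConfigServer, two_phase_commit and metadata_mode are hypothetical names, and failures are modelled as a simple `up` flag.

```python
class ConfigServer:
    """Toy config server holding a copy of the sharding metadata."""

    def __init__(self, up=True):
        self.up = up
        self.metadata = {}
        self._staged = None

    def prepare(self, change):
        # Phase 1: vote yes only if this server is reachable.
        if not self.up:
            return False
        self._staged = dict(change)
        return True

    def commit(self):
        # Phase 2: apply the staged change.
        self.metadata.update(self._staged)
        self._staged = None

    def abort(self):
        self._staged = None


def two_phase_commit(servers, change):
    """Apply `change` on every server, or on none of them."""
    if all(s.prepare(change) for s in servers):
        for s in servers:
            s.commit()
        return True
    for s in servers:
        s.abort()
    return False


def metadata_mode(servers):
    """Summarise the availability rules from the slide."""
    up = sum(1 for s in servers if s.up)
    if up == len(servers):
        return "read-write"   # all config servers up: metadata can change
    if up >= 1:
        return "read-only"    # some down: system stays online, metadata frozen
    return "offline"
```

With one of three servers down, `two_phase_commit` returns False and `metadata_mode` reports "read-only": existing routing information can still be read, but no chunk splits or migrations are recorded.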
9. Shards
• Can be master, master/slave or replica sets
• Replica sets give sharding + full auto-failover
• Regular mongod processes
10. mongos
• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on the app server, so no extra network traffic
12. Queries
• By shard key: routed
• sorted by shard key: routed in order
• by non shard key: scatter gather
• sorted by non shard key: distributed merge
sort
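The two non-shard-key cases can be illustrated with a small Python sketch. It assumes each shard is just an in-memory list of dicts; scatter_gather and distributed_merge_sort are hypothetical helper names, not mongos functions.

```python
import heapq
from operator import itemgetter


def scatter_gather(shards, predicate):
    """Send the query to every shard and concatenate whatever matches."""
    results = []
    for docs in shards:
        results.extend(d for d in docs if predicate(d))
    return results


def distributed_merge_sort(shards, predicate, sort_field):
    """Each shard sorts its own matches; the router merges the sorted streams."""
    key = itemgetter(sort_field)
    per_shard = [
        sorted((d for d in docs if predicate(d)), key=key)  # work done on the shard
        for docs in shards
    ]
    # The router never re-sorts everything: it only merges pre-sorted inputs.
    return list(heapq.merge(*per_shard, key=key))


if __name__ == "__main__":
    shards = [
        [{"_id": 1, "age": 30}, {"_id": 2, "age": 20}],
        [{"_id": 3, "age": 25}, {"_id": 4, "age": 40}],
    ]
    for doc in distributed_merge_sort(shards, lambda d: d["age"] >= 25, "age"):
        print(doc)
```

The merge step is cheap for the router because each shard returns its results already sorted, so the router only needs to compare the head of each stream.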
18. Download MongoDB
http://www.mongodb.org
After Party!
Bar 1920 (it's close)
Don’t worry - the World Cup will
be on - first drink free.
and let us know what you think
@eliothorowitz @mongodb