This document summarizes best practices for scaling MongoDB deployments. It discusses Behance's use of MongoDB for their activity feed, including moving from 40 nodes with 250M documents on ext3 to 60 nodes with 400M documents on ext4. It covers topics like sharding, replica sets, indexing, maintenance, and hardware considerations for large MongoDB clusters.
7. Why MongoDB?
• Easy to use.
• Easy to Iterate on.
• Devs like it.
• Fantastic Community built by 10Gen.
• “Fast.”
8. Why NOT MongoDB?
• Bleeding Edge.
• Not enough battle scars.
• Fewer tried and true fixes.
• No Transactional support.
9. Why MongoDB at Scale?
• Autosharding. (Stop hacking your app.)
• Smart Replica Sets / High Availability.
• Horizontal Scalability
• Easy to grow and shrink.
• Good fit for cloud*.
10. Why NOT MongoDB at Scale?
• Data can take up more space on disk.
• Disk IO in the cloud sucks.
• Database-level write lock.
• More Management than a MySQL cluster.
11. Behance’s Use Case + Fit
• Data is ephemeral.
• Denormalization of existing data.
• Fan Out Approach.
• Sharded by User.
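The fan-out approach above can be sketched in plain JavaScript. This is only an illustration: the `fanOut` helper and the field names (`owner_id`, `actor`, `verb`, `object`, `created`) are assumptions made for the example, since the slides do not show Behance's actual schema.

```javascript
// Hypothetical fan-out-on-write: one denormalized feed document per follower.
// Field names are illustrative assumptions, not Behance's real schema.
function fanOut(action, followerIds) {
  return followerIds.map(function (followerId) {
    return {
      owner_id: followerId,   // the user who will read this feed (a natural shard key)
      actor: action.actor,    // copied (denormalized) so reading the feed needs no joins
      verb: action.verb,
      object: action.object,
      created: action.created
    };
  });
}

var action = { actor: "alice", verb: "appreciated", object: "project:42", created: 1340000000 };
var feedDocs = fanOut(action, ["bob", "carol"]);
console.log(feedDocs.length); // 2
```

Because every document carries the reading user's id, sharding by user keeps each feed on a single shard.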
12. Care & Feeding. Srsly?
• As a less mature DB, admins need to be a bit more aware.
• No Different than MySQL (Need to take into account memory, disk, usage patterns, indexes)
• Watch Error Logs, Disk Use, Data Size, # of Chunks, Old files, Sharding Status, Padding Factors...
13. MongoDB Basics
MongoDB Docs are great, and always improving.
http://docs.mongodb.org/manual/
15. Profiling
• The profiler is equivalent to the slow log in MySQL.
• Logs all operations slower than X milliseconds to a collection (system.profile).
// log slow operations, slow threshold=50ms
> db.setProfilingLevel(1,50)
// get logged operations that took longer than 5ms
> db.system.profile.find( { millis : { $gt : 5 } } )
http://www.mongodb.org/display/DOCS/Database+Profiler
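Since slow operations end up as ordinary documents, they can be summarized like any other data. A minimal sketch in plain JavaScript, assuming entries shaped like `system.profile` documents with `op`, `ns` and `millis` fields; the sample values and the `slowOpsByNamespace` helper are made up for illustration.

```javascript
// Sample entries shaped like system.profile documents (values are made up).
var profile = [
  { op: "query",  ns: "feed.activity", millis: 120 },
  { op: "update", ns: "feed.activity", millis: 30 },
  { op: "query",  ns: "feed.users",    millis: 75 }
];

// Keep operations slower than thresholdMs and count them per namespace.
function slowOpsByNamespace(entries, thresholdMs) {
  var counts = {};
  entries.forEach(function (e) {
    if (e.millis > thresholdMs) {
      counts[e.ns] = (counts[e.ns] || 0) + 1;
    }
  });
  return counts;
}

var slow = slowOpsByNamespace(profile, 50);
console.log(JSON.stringify(slow));
```

A per-namespace count like this is a quick way to see which collection deserves an index review first.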
16. Explain
• Equivalent to MySQL’s EXPLAIN
• From Profiler, grab $query + $orderby, build into real query.
// Explain a query
db.collection.find({ x: 1 }).explain()
http://www.mongodb.org/display/DOCS/Explain
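The step of turning a profiler entry into an explainable query can be sketched as a small string builder. This assumes the profiler entry wraps the query as `{ $query: ..., $orderby: ... }`, as the slide describes; the `entry` sample and the `toExplainCommand` helper are hypothetical.

```javascript
// Hypothetical profiler entry; the $query/$orderby wrapper is what the slide refers to.
var entry = {
  ns: "feed.activity",
  query: { $query: { owner_id: 7 }, $orderby: { created: -1 } }
};

// Turn the entry into a shell command string ending in .explain().
function toExplainCommand(e) {
  var collection = e.ns.substring(e.ns.indexOf(".") + 1); // strip the database name
  var cmd = "db." + collection + ".find(" + JSON.stringify(e.query.$query) + ")";
  if (e.query.$orderby) {
    cmd += ".sort(" + JSON.stringify(e.query.$orderby) + ")";
  }
  return cmd + ".explain()";
}

var explainCmd = toExplainCommand(entry);
console.log(explainCmd);
```

The resulting string can be pasted into the shell after switching to the database named in `ns`.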
17. Replica Sets
• Equivalent to MySQL’s replication, but not quite.
• Resiliency and availability through cleverness.
• ReplicaSet setups
• rs.stepDown()
• rs.slaveOk()
• w parameter
18. Replica Sets
// mongod.conf
replSet = myreplica
// Initiate the Replica Set
> rs.initiate()
//Add a node
> rs.add("myreplica1:27017");
// Allow reads from the secondaries
> rs.slaveOk()
// Write something.
> db.replica.insert({x:1});
// Make sure the write propagates to a majority of servers.
> db.runCommand( { getlasterror : 1 , w : "majority" } )
http://www.mongodb.org/display/DOCS/Replica+Set+Commands
http://docs.mongodb.org/manual/applications/replication/#read-preference
http://www.mongodb.org/display/DOCS/Verifying+Propagation+of+Writes+with+getLastError
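The idea of a majority, which underlies both elections and `w : "majority"`, comes down to simple vote arithmetic. A minimal sketch in plain JavaScript; the helper names are hypothetical, and exactly which members count toward a write majority depends on the server version, so this covers only the counting.

```javascript
// Votes needed for a strict majority of a replica set's voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// A primary can only be elected while a majority of voting members is reachable.
function canElectPrimary(votingMembers, reachable) {
  return reachable >= majority(votingMembers);
}

console.log(majority(3));            // 2
console.log(canElectPrimary(3, 2));  // true: one member down, still a majority
console.log(canElectPrimary(3, 1));  // false: a lone survivor cannot become primary
```

This is why a set with three voting members (three data nodes, or two data nodes plus an arbiter) survives the loss of one member, while a two-member set does not.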
19. Sharding
• Goal: Distribute data across many nodes / replica sets.
• Provides baked-in horizontal scalability.
20. Sharding
• Chunks.
• Routing process - mongos (manages balancing, and query routing)
• Shards - mongod configured with shardsvr = 1.
• Config servers - 3 mongod servers, listed under configdb in mongos.conf
• Shards can be replica sets, or standalone servers.
• Shard Key
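How chunks, the shard key and query routing fit together can be sketched with a toy lookup table. This is a simplified model, not how mongos is implemented: the `chunks` table, the shard names and the `routeToShard` helper are assumptions for illustration.

```javascript
// Hypothetical chunk table: each chunk covers [min, max) of the shard key.
var chunks = [
  { min: -Infinity, max: 1000,     shard: "shard0" },
  { min: 1000,      max: 5000,     shard: "shard1" },
  { min: 5000,      max: Infinity, shard: "shard0" }
];

// Find the shard whose chunk range contains the given shard key value.
function routeToShard(chunkTable, key) {
  for (var i = 0; i < chunkTable.length; i++) {
    var c = chunkTable[i];
    if (key >= c.min && key < c.max) {
      return c.shard;
    }
  }
  return null; // not reached when the ranges cover the whole key space
}

console.log(routeToShard(chunks, 42));    // shard0
console.log(routeToShard(chunks, 1000));  // shard1
```

A query that includes the shard key can be sent to one shard this way; a query without it has to be sent to every shard.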
21. Sharding
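// mongos.conf
configdb = server,server,server
// Add a shard
// connect to mongos the same way you would connect to mongod,
// then run these commands against the admin database
> use admin
> db.runCommand( { addshard : "<serverhostname>[:<port>]" } );
// Enable sharding on the database, then shard a collection
> db.runCommand( { enablesharding : "test" } )
> db.runCommand( { shardcollection : "test.fs.chunks", key : { files_id : 1 } } )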
http://www.mongodb.org/display/DOCS/Replica+Set+Commands
22. Indexing Big - Sharded Index
• Always run on mongos
• Background Indexing
• Sparse indexes
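What makes an index sparse can be shown with a toy model in plain JavaScript: only documents that actually contain the indexed field get an entry. The `buildSparseIndex` helper and the sample documents are hypothetical; a real index is a B-tree maintained by the server, not an object like this.

```javascript
// Toy "sparse index": only documents that contain the field get an entry.
function buildSparseIndex(docs, field) {
  var index = {};
  docs.forEach(function (doc, position) {
    if (Object.prototype.hasOwnProperty.call(doc, field)) {
      var key = String(doc[field]);
      (index[key] = index[key] || []).push(position);
    }
  });
  return index;
}

var docs = [
  { _id: 1, twitter: "@a" },
  { _id: 2 },                 // no twitter field, so no index entry
  { _id: 3, twitter: "@b" }
];
var index = buildSparseIndex(docs, "twitter");
console.log(Object.keys(index).length); // 2
```

When only a fraction of documents carry the field, skipping the rest keeps the index smaller, which matters when indexes need to fit in memory.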
24. Gotchas
• A Sharded cluster with a shard down will return no results.
• If a chunk holds too much data for a single shard key value and cannot be split, balancing will become blocked, and the cluster will become unbalanced.
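Why such a chunk cannot be split can be sketched as follows. This is a simplified model with a hypothetical `findSplitPoint` helper, not MongoDB's actual split algorithm: a split needs a key value that separates the documents into two ranges, and none exists when every document shares the same key.

```javascript
// Pick a split point for a chunk, given the shard key values of its documents.
// Returns null when every document shares one key value, so no split is possible.
function findSplitPoint(keys) {
  var sorted = keys.slice().sort(function (a, b) { return a - b; });
  var median = sorted[Math.floor(sorted.length / 2)];
  if (median !== sorted[0]) {
    return median; // documents below the median go left, the rest go right
  }
  // Median equals the minimum: look for the first larger key instead.
  for (var i = 0; i < sorted.length; i++) {
    if (sorted[i] !== sorted[0]) {
      return sorted[i];
    }
  }
  return null;
}

console.log(findSplitPoint([1, 2, 3, 4]));  // 3
console.log(findSplitPoint([7, 7, 7, 7]));  // null
```

Choosing a shard key with many distinct values is the practical way to avoid the second case.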
25. Hardware / OS
• No knobs in MongoDB.
• Filesystem. (ext4 or xfs)
• Memory.
• Unix distro. Linux kernel >= 2.6.23
http://www.mongodb.org/display/DOCS/Production+Notes#ProductionNotes-LinuxFileSystems
I’m Chris Henry, CTO of Behance.
Goals of this class -> Learn a bit about MongoDB itself, learn if MongoDB is the solution you want, and learn how to deal with some pitfalls that aren’t exactly clear in the docs... or y’know, anywhere.
This is meant to be a conversation, stop me any time something isn’t clear.
Ask.
I’ve been using MongoDB for 3 years now, for any number of failed projects no longer in existence.
Show the Activity Feed
Data Porn
Easy to use -> installation on most systems is a single line. Updating is easy. There’s a driver for basically every language. JSON-like storage makes modeling easy.
Easy to Iterate on -> Once modeled, making changes is really easy. To add a key, just add it to the documents that need it. To remove one, iterate through the documents and use the $unset operator.
Why is “Fast” in quotes? Just like any piece of software, the way you manage it / deploy it / write code for it really determines how “Fast” it is.
Bleeding edge -> will cut you. Definitely production ready, but beware that you will need to devote serious time and effort, and will potentially have problems scaling.
Battle scars -> software gets better by being in production for a long time. MySQL and Postgres both have the benefit of this. MongoDB is still the new kid in town.
Tried and true -> Many paradigms of document design are still developing.
Txn support -> don’t store important data that requires transaction support.
Good news -> 10Gen is aware of most of Mongo’s flaws, and is doing a stellar job of listening to the community and making changes.
Autosharding -> if your data gets too big, just add more capacity, without having to change anything about the way your app connects to mongo.
Replica Sets -> replicas of data that are smart enough to handle outages, and members disappearing.
Horizontal -> Too much data? Just add more shards / nodes. More Nodes = More Scalability.
Cloud* -> Good fit for the on-demand, super fast provisioning of new instances. Why the asterisk? Shitty fit for Disk IO in cloud.
Data -> BSON format allows for much more flexibility, but as documents change size, they need to be moved, which takes up more space. Same problem as memcached.
IO -> AWS has a bad neighbor problem. You’re never sure who else on your virtualized machine will be thrashing the disk. When they do, writes take longer.
Global lock -> huge problem for write intensive standalone servers. In 2.2, this is changing to a database-level lock. In Behance’s use case, this isn’t really helpful, since we have one main collection.
Mgmt. -> Debatable. However, in our case, we have to keep a much closer eye on data size and index size against available memory.
Running any large Database cluster takes some work. However, Mongo seems to be a bit more on the needy side than MySQL.
Get the Sterling Archer image here.
What’s nice about keeping slow operations in a collection is that you can query them the same way you would query your collection.
Shard10-3 has profiling enabled.
db.system.profile.find( { millis : { $gt : 5 } } ).limit(1).pretty()
cleverness -> unlike MySQL, MongoDB replica nodes keep track of each other’s state. If the primary goes down, an election is held between the rest of the nodes, and a new node is elected primary. Since all drivers will detect nodes in the set, writes are then directed there.
setups -> 2 Nodes + Arbiter OR 3 nodes
rs.stepDown() -> force the primary to relinquish its role as primary, so that a secondary is elected primary.
slaveOk -> setting this parameter in the driver will send reads to the secondaries.
Beware: Backgrounding will index in the background on the primary, but in the foreground on secondaries. Use only the primary when indexing. Do it at off-peak hours.