MongoDB presentation from Silicon Valley Code Camp 2014.
A walkthrough of developing, deploying and operating a MongoDB application while avoiding the most common pitfalls.
1. #MongoDB
Advanced MongoDB for Development, Deployment and Operation
Daniel Coupal
Technical Services Engineer, Palo Alto, CA
Silicon Valley Code Camp 2014
2. MongoDB Overview
• 400+ employees
• 1,000+ customers
• 13 offices around the world
• Over $231 million in funding
3. This presentation is not …
• an introduction to MongoDB
– First steps with MongoDB, by Nuri Halperin (5:00 PM Saturday)
• about code examples
– Beer Locker: Building a RESTful API with Node.js, by Scott Smith (2:45 PM Sunday)
– Get MEAN! MongoDb + express + angular + node, by Ward Bell (1:45 PM Saturday)
– Getting RESTless with MeteorJS and MongoDB in the browser, by Ryan Jarvinen (2:45 PM Sunday)
4. This presentation is about …
• Making you successful in developing,
deploying and operating an application with
MongoDB
• I do expect you to know the basics of
MongoDB.
• …even better if you already have an
application about to be deployed
5. Agenda
1. Some Concepts
2. The Story of your Application
I. Prototype and Development
II. Deployment
III. Operation
3. Wrapping up
4. Q&A
7. Some Concepts
• Oplog
• Working set
• MMS
• Collection scans
• Deployments/elections
8. What is a Replica Set Oplog?
• A capped collection that stores an ordered
history of logical writes to a MongoDB
database
– Relative operations like increment, add to set, etc. are not
stored as such. They are translated to the final document values.
– Safe to replay old oplog entries, provided all of them are
replayed in the right order.
• Enables replication
• Enables backups
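To illustrate why translating relative operations makes replay safe, here is a minimal sketch in plain Python. It is a toy model, not the real oplog entry format: the functions `to_oplog_entry` and `apply_entry` and the sample document are hypothetical names made up for this example.

```python
import copy

def to_oplog_entry(doc, update):
    """Translate a relative update into a '$set' of final values (toy model)."""
    final = {}
    # An increment is recorded as the resulting value, not as "+N".
    for field, amount in update.get("$inc", {}).items():
        final[field] = doc.get(field, 0) + amount
    # An add-to-set is recorded as the resulting array.
    for field, value in update.get("$addToSet", {}).items():
        current = list(doc.get(field, []))
        if value not in current:
            current.append(value)
        final[field] = current
    return {"$set": final}

def apply_entry(doc, entry):
    """Apply a translated entry to a copy of the document."""
    result = copy.deepcopy(doc)
    result.update(copy.deepcopy(entry["$set"]))
    return result

doc = {"_id": 1, "views": 10, "tags": ["a"]}
entry = to_oplog_entry(doc, {"$inc": {"views": 1}, "$addToSet": {"tags": "b"}})

once = apply_entry(doc, entry)
twice = apply_entry(once, entry)  # replaying the same entry changes nothing
```

Replaying the raw increment twice would count the view twice; replaying the translated entry twice leaves the document unchanged.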
9. Sizing the Oplog collection
• The size of the capped collection dictates how many
hours a secondary/backup agent can stop talking to
the primary and still catch up
• MMS Monitoring has
a Replication Oplog
Window graph
• Higher rate of writes
to the DBs requires a
larger Oplog collection
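The relationship between write rate and oplog size can be sketched as simple arithmetic. This is a rough estimate that assumes a constant write rate; the function names and the figures in the example are hypothetical.

```python
def oplog_window_hours(oplog_size_mb, write_rate_mb_per_hour):
    """Approximate hours of history the oplog holds at a constant write rate."""
    if write_rate_mb_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_mb / write_rate_mb_per_hour

def required_oplog_mb(target_window_hours, write_rate_mb_per_hour):
    """Approximate oplog size needed to cover a target window."""
    return target_window_hours * write_rate_mb_per_hour

# Hypothetical figures: a 10 GB oplog with 512 MB of oplog writes per hour.
window = oplog_window_hours(10 * 1024, 512)
needed = required_oplog_mb(48, 512)
```

Doubling the write rate halves the window, which is why write-heavy deployments need a larger oplog to tolerate the same outage.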
10. Working set
• Working Set: The total body of data+indexes
that the application uses in the course of
normal operation.
– http://docs.mongodb.org/manual/faq/storage/#what-is-the-working-set
– MongoDB v2.4 added a working set estimator to the
serverStatus command
– http://docs.mongodb.org/manual/reference/command/serverStatus/#serverStatus.workingSet
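A rough way to use the estimator is to compare it with available RAM. The sketch below assumes the v2.4 `workingSet` section reports `pagesInMemory` and `overSeconds`, and that a page is 4 KB; both should be checked against the documentation linked above. The sample values and the 80% headroom figure are made up for illustration.

```python
PAGE_SIZE_BYTES = 4096  # assumption: default 4 KB page size

def working_set_bytes(working_set):
    """Convert the estimator's page count into bytes."""
    return working_set["pagesInMemory"] * PAGE_SIZE_BYTES

def fits_in_ram(working_set, ram_bytes, headroom=0.8):
    """True if the estimate fits in a fraction (headroom) of RAM."""
    return working_set_bytes(working_set) <= ram_bytes * headroom

# Hypothetical workingSet section, as it might appear in serverStatus output.
sample = {"pagesInMemory": 1_500_000, "computationTimeMicros": 12000, "overSeconds": 900}

GIB = 1024 ** 3
estimate = working_set_bytes(sample)
ok_on_8g = fits_in_ram(sample, 8 * GIB)
ok_on_4g = fits_in_ram(sample, 4 * GIB)
```

The estimate only covers the period reported in `overSeconds`, so it is best sampled during normal and peak load rather than read once.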
11. The MMS Components
A. Monitoring
1. Cloud: Sept 2011
2. On-Prem: July 2013
B. Backups
1. Cloud: April 2013
2. On-Prem: April 2014
C. Automation
1. Cloud: October 2014
16. Collection scan
• Very bad if you have a large collection
• One of the main performance issues we see in our
customers’ applications
• Can be identified in the logs with the ‘nscanned’
attribute on slow queries
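As a sketch of how such log lines could be screened, the Python below extracts `nscanned`, `nreturned` and the duration from a slow-query line and flags a high scanned-to-returned ratio. The sample line is illustrative rather than copied from a real log, the exact log layout varies between versions, and the threshold of 100 is an arbitrary assumption.

```python
import re

def parse_slow_query(line):
    """Extract nscanned, nreturned and duration (ms) from a log line, or None."""
    scanned = re.search(r"\bnscanned:(\d+)", line)
    returned = re.search(r"\bnreturned:(\d+)", line)
    millis = re.search(r"\b(\d+)ms\s*$", line)
    if not (scanned and returned and millis):
        return None
    return {
        "nscanned": int(scanned.group(1)),
        "nreturned": int(returned.group(1)),
        "millis": int(millis.group(1)),
    }

def looks_like_collection_scan(entry, ratio_threshold=100):
    """Flag queries that scan far more documents than they return."""
    return entry["nscanned"] / max(entry["nreturned"], 1) >= ratio_threshold

# Illustrative line, not taken from a real deployment.
line = ('Tue Oct 14 10:12:01.123 [conn42] query shop.orders query: { status: "open" } '
        'ntoreturn:0 ntoskip:0 nscanned:1000000 keyUpdates:0 '
        'locks(micros) r:1500000 nreturned:20 reslen:4100 1523ms')

entry = parse_slow_query(line)
suspicious = looks_like_collection_scan(entry)
```

A query that scans a million documents to return twenty is a strong hint that an index is missing.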
17. Deployments/elections
• 3 data nodes
• If you have an even number of data nodes, add an arbiter
– Don’t use more than one arbiter
• Many Data Centers or availability zones
• What is important to you?
=> the trade-off can be chosen per operation
– Durability of writes
– Performance
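The reason for keeping an odd number of voting members comes down to majority arithmetic: a primary can only be elected while a majority of voting members is reachable. A minimal sketch, with hypothetical function names:

```python
def majority(voting_members):
    """Votes needed to elect a primary."""
    return voting_members // 2 + 1

def fault_tolerance(voting_members):
    """Members that can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)

# Compare set sizes: going from 3 to 4 members adds no fault tolerance.
table = {n: (majority(n), fault_tolerance(n)) for n in (2, 3, 4, 5)}
```

With 4 data nodes the set still only survives one failure, so adding an arbiter as a fifth voter is what raises the tolerance to two.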
19. I. Prototype and Development
1. Schema, schema, schema!
2. What happens when a failure is returned
by the database?
3. Index correctly
4. Incorporate testability in your application
5. Think about data sizing and growth
6. Performance Tuning
20. Think about data sizing and growth
• How much data will you have initially?
• How will your data set grow over time?
• How big is your working set?
• Will you be loading huge bulk inserts, or have a constant
stream of writes?
• How many reads and writes will you need to service per
second?
• What is the peak load you need to provision for?
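These questions can be turned into a back-of-the-envelope estimate. The sketch below assumes linear growth and ignores indexes and storage overhead; the function names and all input figures are hypothetical.

```python
def projected_data_gb(initial_docs, avg_doc_bytes, new_docs_per_day, days):
    """Projected raw data size in GB, assuming linear growth."""
    total_docs = initial_docs + new_docs_per_day * days
    return total_docs * avg_doc_bytes / 1024 ** 3

def peak_ops_per_second(avg_ops_per_second, peak_factor):
    """Load to provision for, given an average and a peak multiplier."""
    return avg_ops_per_second * peak_factor

# Hypothetical application: 5M documents of 2 KB today, 100k new ones per day.
after_one_year = projected_data_gb(5_000_000, 2048, 100_000, 365)
provision_for = peak_ops_per_second(2_000, 3)
```

Comparing the projection against the working set and available RAM shows how long the current hardware is likely to last.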
21. Performance Tuning
1. Assess the problem and establish acceptable behavior
2. Measure the current performance
3. Find the bottleneck*
4. Remove the bottleneck
5. Re-test to confirm
6. Repeat
* This is often the hard part
(Adapted from http://en.wikipedia.org/wiki/Performance_tuning )
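Steps 2 and 5 (measure, then re-test) are easier to repeat with a small timing harness. This is a generic sketch, not specific to MongoDB: `measure` and `meets_target` are hypothetical helpers, and `operation` stands for whatever query or request is being tuned.

```python
import statistics
import time

def measure(operation, runs=20):
    """Run the operation several times and summarize latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"median_ms": statistics.median(samples), "max_ms": max(samples)}

def meets_target(stats, target_ms):
    """Compare the measured median with the agreed acceptable behavior."""
    return stats["median_ms"] <= target_ms

# Stand-in workload; replace with the real operation under test.
stats = measure(lambda: sum(range(1000)), runs=5)
```

Running the same harness before and after removing a bottleneck gives a like-for-like comparison for step 5.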
22. II. Deployment
1. Deployment topology
2. Have a test/staging environment
– Track slow queries and collection scans
3. MongoDB production notes
– http://docs.mongodb.org/manual/administration/production-notes
4. Storage considerations
23. Storage considerations
• RAID
=> 0+1
• NAS, SAN or Direct Attached?
=> Direct Attached
• HDD or SSD
=> SSD, if budget permits
25. Disaster will strike
“Shit will happen!”
• Are you prepared?
• Have backups?
• Have a good picture of your “normal state”
26. Monitor
• iostat, top, vmstat, sar
• mongostat, mongotop
• MMS Monitoring
– Use Munin extensions
27. Upgrade
• Major versions have the same binary format,
the same protocol, etc.
• MMS Automation handles automatic
upgrades
28. Comparing MongoDB backup approaches
Criterion                               Mongodump  File system  MMS Backup Cloud  MMS Backup On-Prem
Initial complexity                      Medium     High         Low               High
System overhead                         High       Low          Low               Medium
Point in time recovery of replica set   Yes *      No           Yes               Yes
Consistent snapshot of sharded system   Yes *      Yes *        Yes               Yes
Scalable                                No         Yes          Yes               Yes
Restore time                            Slow       Fast         Medium            Medium
* Possible, but you need to write the tools and go through a lot of pain
30. Common Mistakes
1. Missing indexes
2. Not testing before deploying application changes
3. ulimits set too low
a. number of open files => 64000
b. number of processes/threads => 64000
4. Inappropriate schema
5. Hardware
a. right disks for the job
b. enough RAM
6. Not seeking help early enough
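The ulimit values in item 3 can be checked with a few lines of Python using the standard `resource` module (Unix only). Note the assumption: this reports the limits of the process running the script, so it only reflects mongod if it runs under the same user and shell settings. The helper names are hypothetical.

```python
import resource

RECOMMENDED = 64000  # value from the slide, for open files and processes/threads

def is_sufficient(soft_limit, recommended=RECOMMENDED):
    """True if the soft limit is unlimited or at least the recommended value."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= recommended

def check_ulimits():
    """Return {limit name: (soft limit, sufficient?)} for the current process."""
    report = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        limit_id = getattr(resource, name, None)
        if limit_id is None:  # not every platform defines every limit
            continue
        soft, _hard = resource.getrlimit(limit_id)
        report[name] = (soft, is_sufficient(soft))
    return report

for name, (soft, ok) in check_ulimits().items():
    print(name, soft, "OK" if ok else "below recommended")
```

Running this as part of a deployment checklist catches a low limit before it shows up as refused connections under load.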
31. Resources
• MongoDB Professional Customer Support
– 24x7 support
– the sun never sets on the MongoDB Customer Support Team
• MongoDB Consulting Days
• MongoDB World (@NYC in June)
• MongoDB Days (@SF on Dec 3, 2014)
• MongoDB Office Hours
• Google Groups
33. Summary
• Use available resources
• Testing
– Plan for it, plan resources for it, do it before deploying
34. Take away
I hope you walk out of this presentation and
make at least one change to your
application, deployment, configuration, etc.
that will prevent one issue from happening.
35. We hire
Positions open in Palo Alto, Austin and NYC
• http://www.mongodb.com/careers
Technical Services Engineer in Palo Alto
• http://www.mongodb.com/careers/positions/technical-services-engineer