Use MongoDB at Any Scale
As you scale, one of the challenges is optimizing your clusters and mitigating operational risk. Proper preparation can result in significant savings and reduced downtime.
This session covers:
* Deployment of dev/test/production environments across private data centers or public clouds
* What to monitor in production environments
* Management automation with ClusterControl from Severalnines
* How ClusterControl works with TokuMX
The session will give you the tools to more effectively manage your cluster, immediately. The presentation will include code samples and a live Q&A session.
This webinar is being delivered jointly by Severalnines and Tokutek. Severalnines provides automation and management tools to reduce the complexity of working with highly available database clusters. Tokutek provides high performance and scalability for MongoDB, MySQL and MariaDB.
2. Confidential
Webinar Housekeeping
This webinar is being recorded
A link to the recording and to a copy of the slides will be posted on tokutek.com
We welcome questions: enter questions into the chat box and we will respond at the end of the presentation
Think of something later?
Email Tokutek at contact@tokutek.com
Email Severalnines at info@severalnines.com
3. Confidential
Agenda
MongoDB & Automation
What is operational management?
MongoDB Management caveats
Automation & Management by ClusterControl
Demo
Copyright Severalnines AB
7. Confidential
MongoDB and Automation
MongoDB is great for developers
MongoDB is not as great for ops folks
Lack of operational tools
MMS Management: mainly a monitoring tool
MMS Automation: in alpha
Perhaps not surprising for a 5-year-old product
General-purpose tools can help some
E.g., Puppet, Chef
However…
8. Confidential
Drawback with Puppet or Chef
Puppet/Chef are appropriate for a group of single-node components
E.g., webservers can be clones of each other
Deploy 10 webservers and they all look the same
Distributed databases are more complex
Different node types
Different roles and responsibilities
Specific order for procedures
Using, e.g., Chef to deploy a distributed database
Yes, it is possible
But how much Chef functionality is actually leveraged vs. how much code is written by the user?
9. Confidential
What do Ops folks do?
- Deployment
Optimal hardware (CPU/RAM/Disk)
What topology to start with?
Virtualized or bare metal? Cloud?
Multi-region or multi-AZ
Good initial configuration settings for DB
OS tuning (high dependency)
Monitoring the DB + underlying OS
Logging
10. Confidential
What do Ops folks do?
- Performance monitoring
What do you do when the application is slow?
Is it Disk? CPU? RAM? Badly written queries?
What are the symptoms? (Replication lag, page faults, locks, # connections, …)
Do you need to scale?
How do you scale?
Capacity planning
12. Confidential
What do Ops folks do?
- Availability
Keep the service running
How do you detect something has failed?
Drilling down to root cause
Manual vs automatic failover
How do you avoid failures?
Copyright 2012 Severalnines AB
13. Confidential
What do Ops folks do?
- Management
Backup and Restore
Software upgrades and rolling restarts
Configuration changes
Adding nodes or shards
Rebalancing of shards
Compaction
15. Confidential
Management caveats (1/2)
1 Config server instead of 3
Starting only 2 config servers is not good enough
Read-only config – no changes in cluster state
No new shards can be added, no new users with userid/pwd, …
At least 2 routing servers (mongos)
A single router is a SPOF
ReplicaSet: odd number of replicas
At least 3 to handle voting / network partitioning
To build a ReplicaSet, start the first node and initiate the set on it (rs.initiate()), then add the other nodes to it (rs.add()).
Sharding: pre-defined order for procedures
Start config servers (start with 1 node, then add the rest to it)
Start mongos (routers) (start with 1 router, then add more routers)
Build a ReplicaSet and add it as a shard
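The "odd number of replicas" rule comes down to majority arithmetic: a primary can only be elected while a strict majority of voting members is reachable. A minimal sketch of that arithmetic follows; the function names are hypothetical and chosen only for illustration.

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary: a strict majority of voting members."""
    if voting_members < 1:
        raise ValueError("a replica set needs at least one voting member")
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """How many voting members can be unreachable while a primary can still be elected."""
    return voting_members - majority(voting_members)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} members: majority={majority(n)}, can lose {tolerated_failures(n)}")
```

Going from 3 to 4 members does not raise the number of failures the set can survive, which is why an even member count adds cost without adding availability.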
16. Confidential
Management caveats (2/2)
Backups
Lock a node, flush, then take a snapshot
For a sharded cluster, a bit more complicated
Config server data needs to be saved
All shards must be backed up at the same time for a cluster-wide snapshot
Rolling upgrades
Configuration change (e.g., moving a node to a more powerful server), version upgrade/patch, …
E.g., in a 3-node ReplicaSet, do not shut down 2 nodes at once: the remaining node loses its majority and becomes a read-only secondary.
Defragmentation, resharding, index rebuilds, etc.
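The "lock a node, flush, then take a snapshot" step can be sketched as a context manager that always unlocks the node, even if the snapshot fails. This is a sketch under assumptions, not a tested backup procedure: admin_command stands for any callable that sends an admin command to one node (for example pymongo's client.admin.command), take_snapshot is a hypothetical callback, and the fsyncUnlock command is assumed to be available on the server (older MongoDB releases and TokuMX may handle unlocking differently).

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator

# Hypothetical type: any callable that sends an admin command to one node,
# e.g. pymongo's `client.admin.command` (an assumption, not from the slides).
AdminCommand = Callable[..., Any]


@contextmanager
def locked_for_snapshot(admin_command: AdminCommand) -> Iterator[None]:
    """Flush and lock a node for the duration of a snapshot, then unlock it."""
    admin_command("fsync", lock=True)  # flush pending writes and block new ones
    try:
        yield  # the caller takes the filesystem snapshot here
    finally:
        admin_command("fsyncUnlock")  # always unlock, even if the snapshot fails


def backup_node(admin_command: AdminCommand, take_snapshot: Callable[[], None]) -> None:
    """Take one node-level snapshot while the node is locked."""
    with locked_for_snapshot(admin_command):
        take_snapshot()
```

For a sharded cluster this covers a single node only; the config server data and the coordination across shards described above still have to be handled around it.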
17. Confidential
ClusterControl
Automation & Management
Provisioning
Auto deploy a Sharded Cluster in minutes
On-premises or in the cloud
Monitoring
1-second resolution
Both DB and OS stats
Realtime and historical
Management
Manage multiple clusters
Multi data-center
Automate failover, upgrades, backups,…
One-click scaling
19. Confidential
Demo - Manage multiple clusters through a single pane of glass
[Diagram: clusters in an internal data center (East Coast US), an internal data center (West Coast US), and a public cloud]
21. Confidential
Agenda
Common User Issues
What’s TokuMX™
What are the advantages
What should I monitor
How does Severalnines help
22. Confidential
Common Problems
I can’t ingest sources fast enough
My data is getting too big
I’m spending too much money on infrastructure
DB level locking is slowing me down
23. Confidential
What is TokuMX?
An open-source fork of MongoDB
Uses proprietary Fractal Trees
Keeps MongoDB APIs (no code change)
Replaces storage code
Builds off of 8+ years of MySQL development
24. Confidential
What are the Advantages?
Performance
Concurrency (doc level vs DB level)
Cache management (defined vs memory mapped)
Efficient index maintenance (no I/O required, thanks to Fractal Tree indexes)
Compression
Large blocks (4MB)
3 libraries (quicklz, zlib, lzma)
Flash friendly (fewer reads/writes)
Transactions
MVCC consistent reads (consistent snapshot of data)
Multistatement commit/rollback
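To get a feel for why document data compresses well, here is a minimal sketch that compares two of the three listed codec families using Python's standard library (zlib and lzma; quicklz has no standard-library binding and is left out). It does not measure TokuMX itself: the sample documents and function names are hypothetical, and real ratios depend on the data and on the storage engine's block layout.

```python
import json
import lzma
import zlib


def compression_ratios(payload: bytes) -> dict:
    """Return original_size / compressed_size for each standard-library codec."""
    return {
        "zlib": len(payload) / len(zlib.compress(payload)),
        "lzma": len(payload) / len(lzma.compress(payload)),
    }


def sample_documents(count: int) -> bytes:
    """Hypothetical JSON documents with repeated field names, one per line."""
    docs = [
        {"user_id": i, "status": "active", "country": "US", "score": i % 10}
        for i in range(count)
    ]
    return "\n".join(json.dumps(d) for d in docs).encode("utf-8")


if __name__ == "__main__":
    for codec, ratio in compression_ratios(sample_documents(10_000)).items():
        print(f"{codec}: {ratio:.1f}x smaller")
```

Repeated field names and values are what the codecs exploit, which is also why larger blocks tend to compress better than small ones.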
25. Confidential
What Should I Monitor?
Mongo Performance
opcounters
Cache Use
Effectiveness of memory
Space
% full
Compression
Disk Utilization
What’s utilizing my disk(s)
31. Confidential
Disk Utilization
Can be tricky
No one thing causes IO
Helps to troubleshoot if you can narrow it to reads or writes
Baselines can help decrease time to resolve
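One way to use a baseline is to flag a metric when it drifts far from its recorded normal range, checking reads and writes separately. This is a minimal sketch, assuming a simple "more than N standard deviations from the mean" rule; the threshold, metric names and sample numbers are hypothetical, not recommendations.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence


def is_abnormal(baseline: Sequence[float], current: float, tolerance: float = 3.0) -> bool:
    """True if `current` is more than `tolerance` standard deviations from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    spread = stdev(baseline)
    if spread == 0:
        return current != baseline[0]
    return abs(current - mean(baseline)) > tolerance * spread


def narrow_down(baselines: Dict[str, Sequence[float]], current: Dict[str, float]) -> List[str]:
    """Return the names of the metrics (e.g. reads, writes) that are outside their baseline."""
    return [name for name, value in current.items() if is_abnormal(baselines[name], value)]


if __name__ == "__main__":
    baselines = {"reads": [100, 110, 90, 105, 95], "writes": [40, 45, 38, 42, 41]}
    print(narrow_down(baselines, {"reads": 102, "writes": 900}))
```

With per-direction baselines in place, the first troubleshooting question (is it reads or writes?) is answered by the data rather than by guesswork.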
33. Confidential
Everything Else
TokuMX tends to trade IO utilization for CPU
Compression and decompression
FT maintenance
Just monitor your CPUs like any other resource
Severalnines is exceptional at this… try it for yourself
db.serverStatus() is your friend
We’re moving interesting stats there to make it easier to monitor
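The opcounters section of db.serverStatus() holds cumulative counts since the server started, so a rate comes from sampling twice and dividing the difference by the interval. A minimal sketch of that calculation follows; the function name and the sample values are hypothetical, and fetching the two samples (for example through a driver) is left out.

```python
from typing import Dict, Mapping


def ops_per_second(
    earlier: Mapping[str, int],
    later: Mapping[str, int],
    elapsed_seconds: float,
) -> Dict[str, float]:
    """Convert two cumulative opcounters samples into per-second rates."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        op: (later[op] - earlier[op]) / elapsed_seconds
        for op in later
        if op in earlier
    }


if __name__ == "__main__":
    # Hypothetical opcounters samples taken 10 seconds apart.
    first = {"insert": 100, "query": 500, "update": 20, "delete": 5}
    second = {"insert": 400, "query": 800, "update": 70, "delete": 5}
    print(ops_per_second(first, second, 10))
```

A negative rate means the counters were reset between samples (for example by a restart), so that interval should be discarded rather than plotted.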