Approaching 1 Billion
       Documents in MongoDB




                     David Mytton
1/25
       david@boxedice.com / www.mytton.net
Server Density Monitoring


       Processing       Database            UI




2/25
                    www.serverdensity.com
db.stats()
       Documents                     981,289,332

       Collections                         47,962

       Indexes                             39,684

       Data size                           369GB

       Index size                          241GB


3/25
                    As of 25th Apr 2010.
10 months




4/25
       Why we moved: http://bit.ly/mysqltomongo
Initial Setup

                     Replication




       Master                       Slave
         DC1                         DC2
       8GB RAM                     8GB RAM

5/25
Vertical Scaling

                  Replication




        Master                   Slave
         DC1                      DC2
       72GB RAM                 8GB RAM

6/25
Tip #1

       Keep your indexes in
       memory at all times.

           db.stats()


7/25
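A minimal sketch of how Tip #1 might be checked, assuming sizes are compared as byte counts. The helper name `indexes_fit_in_ram`, the database name `mydb`, and the commented wiring are illustrative assumptions, not part of the original deck.

```shell
# Hypothetical helper: succeeds (exit 0) when the index size fits within RAM.
# Both arguments are byte counts.
indexes_fit_in_ram() {
    local index_bytes=$1 ram_bytes=$2
    [ "$index_bytes" -le "$ram_bytes" ]
}

# Example wiring (assumes the mongo shell is on PATH and a Linux /proc/meminfo;
# "mydb" is a placeholder database name):
#   index_bytes=$(mongo --quiet --eval 'print(db.stats().indexSize)' mydb)
#   ram_bytes=$(( $(awk '/^MemTotal:/ {print $2}' /proc/meminfo) * 1024 ))
#   indexes_fit_in_ram "$index_bytes" "$ram_bytes" || echo "indexes exceed RAM"
```

db.stats() reports on one database at a time, so on a server hosting several databases the index sizes would need to be summed before comparing.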
Manual Partitioning
                     Replication




          Master A                   Slave A
           DC1                        DC2
         16GB RAM                  16GB RAM


                     Replication




          Master B                   Slave B
           DC1                        DC2
         16GB RAM                  16GB RAM
8/25
Database vs collections


       • Many databases = many data files (small but
         quickly get large).
       • Many collections = watch namespace limit.

9/25
Namespaces = Number of collections +
                number of indexes




10/25
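The formula above as a worked sum, using the collection and index counts from the db.stats() slide. The function name `namespace_total` is a hypothetical label for illustration.

```shell
# Hypothetical helper: namespaces = number of collections + number of indexes.
namespace_total() {
    local collections=$1 indexes=$2
    echo $(( collections + indexes ))
}

# Figures from the db.stats() slide: 47,962 collections and 39,684 indexes.
namespace_total 47962 39684   # prints 87646
```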
Tip #2


        Monitor the 24,000
         namespace limit.


11/25
Using Server Density




12/25
Console

        db.system.namespaces.count()




13/25
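A rough sketch of how the console count could feed a simple alert. The helper names, the 80% warning threshold, and the database name `mydb` are assumptions chosen for illustration; only the 24,000 limit and the count command come from the slides.

```shell
# Hypothetical helper: prints namespace usage as a whole-number percentage
# of the limit (integer division, so it rounds down).
namespace_pct() {
    local used=$1 limit=${2:-24000}
    echo $(( used * 100 / limit ))
}

# Hypothetical helper: succeeds while usage is below a warning threshold
# (default 80%, an arbitrary choice for this sketch).
namespaces_ok() {
    local used=$1 limit=${2:-24000} warn_pct=${3:-80}
    [ "$(namespace_pct "$used" "$limit")" -lt "$warn_pct" ]
}

# Example wiring (assumes the mongo shell is on PATH; "mydb" is a placeholder):
#   used=$(mongo --quiet --eval 'print(db.system.namespaces.count())' mydb)
#   namespaces_ok "$used" || echo "WARNING: $used of 24000 namespaces in use"
```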
Replica Pairs = Failover
                        Replica Pair




             Master A                    Slave A
              DC1                         DC2
            16GB RAM                   16GB RAM


                        Replica Pair




             Master B                    Slave B
              DC1                         DC2
            16GB RAM                   16GB RAM
14/25
Tip #3


        Pre-provision your oplog files.



15/25
A shell script to generate 75GB oplog files




        # Pre-allocate 41 zero-filled files, local.0 to local.40
        for i in {0..40}
        do
            echo "$i"
            head -c 2146435072 /dev/zero > "local.$i"
        done




16/25
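A quick check of how much disk the loop above allocates, taking its numbers at face value. The variable names are illustrative. The result is 83,927 MiB (about 82 GiB), so the loop as written allocates a little more than the 75GB in the slide title.

```shell
# Values taken from the pre-provisioning loop: files local.0 to local.40.
file_count=41
file_bytes=2146435072   # 2047 MiB per file

total_bytes=$(( file_count * file_bytes ))
total_mib=$(( total_bytes / 1024 / 1024 ))

echo "files:       $file_count"
echo "total bytes: $total_bytes"
echo "total MiB:   $total_mib"
```

Running this before the loop makes it easy to confirm the target volume has enough free space.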
Tip #4


        Expect slower performance
         during initial replica sync.


17/25
Tip #5


        You can rotate your log files
             from the console.


18/25
Rotating your log files

         use admin
         db.runCommand("logRotate")




19/25
Tip #6

            Index creation blocks by
            default. Use background
              indexing if necessary.



20/25
        MongoDB Manual: http://bit.ly/mongobgindex
Tip #7

         Increase your OS file
         descriptor limit + use
        persistent connections.


21/25
Too many open files!
                 /etc/security/limits.conf
         # user   type   item     limit
         mongo    hard   nofile   10000
         mongo    soft   nofile   10000




                  /etc/ssh/sshd_config

                     UsePAM yes
22/25
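A small sketch for confirming that the new limit has taken effect, assuming it is run as the user that starts mongod. The helper name `nofile_ok` is hypothetical; 10000 is the value from the limits.conf lines above.

```shell
# Hypothetical helper: succeeds when the current soft limit on open files
# is at least the required value ("unlimited" always passes).
nofile_ok() {
    local current=$1 required=$2
    [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]
}

# Compare the limit of the current shell against the configured value.
if nofile_ok "$(ulimit -n)" 10000; then
    echo "open-file limit OK: $(ulimit -n)"
else
    echo "open-file limit too low: $(ulimit -n) (want 10000)"
fi
```

Limits from limits.conf are applied at login through PAM, so the check is only meaningful in a fresh session opened after the change.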
Space is not reused




23/25
Tip #8


        10gen commercial support is
             worth paying for.


24/25
Summary
        1. Keep indexes in memory.
        2. Monitor the 24k namespace limit.
        3. Pre-provision oplog files.
        4. Expect slower performance on replica sync.
        5. Rotate logs from the console.
        6. Index creation blocks by default.
        7. OS file descriptor limit + persistent connections.
        8. Commercial support is worth it.
25/25
