SlideShare uma empresa Scribd logo
1 de 30
Baixar para ler offline
Approaching 1 Billion
       Documents in MongoDB




                    David Mytton
1/30      david@boxedice.com / @davidmytton
Server Density Monitoring


       Processing       Database            UI




2/30
                    www.serverdensity.com
Cache / Data Store
                      Postback




       checksLatest              checksHistorical



3/30
db.stats()
       Documents                  937,393,315

       Collections                      27,566

       Indexes                          45,277

       Stored data                      638GB

       Inserts                    5000-8000/s


4/30
                 As of 17th Jun 2010.
13 months ago




5/30
       Why we moved: http://bit.ly/mysqltomongo
Initial Setup

                     Replication




       Master                       Slave
         DC1                         DC2
       8GB RAM                     8GB RAM

6/30
Vertical Scaling

                  Replication




        Master                   Slave
         DC1                      DC2
       72GB RAM                 8GB RAM

7/30
Tip #1

       Keep your indexes in
       memory at all times.

           db.stats()


8/30
i/o not an issue




9/30
Tip #2
         Data is flushed to disk every 60s.


        db.runCommand({fsync:1});


             --syncdelay [60]

10/30
Sharding solves
          everything




11/30
Manual Partitioning
                      Replication




           Master A                   Slave A
            DC1                        DC2
          16GB RAM                  16GB RAM


                      Replication




           Master B                   Slave B
            DC1                        DC2
12/30     16GB RAM                  16GB RAM
Sustained Traffic



                  Master                      Slave
        Avg out:       2.4Mbit/s   Avg out:       4.0Mbit/s
        Avg in:        3.8Mbit/s   Avg in:      111.2Kbit/s

13/30
Database vs collections


        • Many databases = many data files (small but
          quickly get large).
        • Many collections = watch namespace limit.

14/30
Namespaces = Number of collections +
                number of indexes




15/30
Tip #3


        Monitor the 24,000
         namespace limit.


16/30
Using Server Density




17/30
Console

        db.system.namespaces.count()




18/30
Replica Pairs = Failover
                        Replica Pair




             Master A                    Slave A
              DC1                         DC2
            16GB RAM                   16GB RAM


                        Replica Pair




             Master B                    Slave B
              DC1                         DC2
19/30       16GB RAM                   16GB RAM
Tip #4


        Pre-provision your oplog files.



20/30
A shell script to generate 75GB oplog files




          for i in {0..40}
        do echo $i
        head -c 2146435072 /dev/zero > local.$i
        done




21/30
Tip #5


        Expect slower performance
         during initial replica sync.


22/30
Tip #6


        You can rotate your log files
             from the console.


23/30
Rotating your log files

         db.runCommand("logRotate")




24/30
Tip #7

            Index creation blocks by
            default. Use background
              indexing if necessary.



25/30
        MongoDB Manual: http://bit.ly/mongobgindex
Tip #8

         Increase your OS file
         descriptor limit + use
        persistent connections.


26/30
Too many open files!
                 /etc/security/limits.conf
         mongo hard nofile 10000
         mongo soft nofile 10000
          user                 type          limit




                  /etc/ssh/sshd_config

                     UsePAM yes
27/30
Space is not reused
        Data + indexes                  551GB

        Actual disk usage               638GB


                         Fixed in
        1.1.4 1.3.x 1.5.0 1.5.1 1.5.2 1.5.3 1.5.4?

28/30
                     JIRA: SERVER-366
Summary
        1. Keep indexes in memory.
        2. Data is flushed to disk every 60s.
        3. Monitor the 24k namespace limit.
        4. Pre-provision oplog files.
        5. Expect slower performance on replica sync.
        6. Rotate logs from the console.
        7. Index creation blocks by default.
29/30   8. OS file descriptor limit + persistent connections.
Slides
          blog.boxedice.com/mongodb




                  David Mytton
30/30   david@boxedice.com / @davidmytton

Mais conteúdo relacionado

Mais procurados

深入了解Redis
深入了解Redis深入了解Redis
深入了解Redisiammutex
 
Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012Jeremy Zawodny
 
分析mysql acid 设计实现
分析mysql acid 设计实现分析mysql acid 设计实现
分析mysql acid 设计实现rfyiamcool
 
No sql but even less security
No sql but even less securityNo sql but even less security
No sql but even less securityiammutex
 
Strategies for Backing Up MongoDB
Strategies for Backing Up MongoDBStrategies for Backing Up MongoDB
Strategies for Backing Up MongoDBMongoDB
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisArnab Mitra
 
Oreilly Webcast 01 19 10
Oreilly Webcast 01 19 10Oreilly Webcast 01 19 10
Oreilly Webcast 01 19 10Sean Hull
 
Hypertable Berlin Buzzwords
Hypertable Berlin BuzzwordsHypertable Berlin Buzzwords
Hypertable Berlin Buzzwordshypertable
 
XtraDB 5.6 and 5.7: Key Performance Algorithms
XtraDB 5.6 and 5.7: Key Performance AlgorithmsXtraDB 5.6 and 5.7: Key Performance Algorithms
XtraDB 5.6 and 5.7: Key Performance AlgorithmsLaurynas Biveinis
 
This is redis - feature and usecase
This is redis - feature and usecaseThis is redis - feature and usecase
This is redis - feature and usecaseKris Jeong
 
Introduction to MongoDB with PHP
Introduction to MongoDB with PHPIntroduction to MongoDB with PHP
Introduction to MongoDB with PHPfwso
 
5 Tips for Getting Started with Pivotal GemFire
5 Tips for Getting Started with Pivotal GemFire5 Tips for Getting Started with Pivotal GemFire
5 Tips for Getting Started with Pivotal GemFireVMware Tanzu
 
Boosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringBoosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringShapeBlue
 
Frontera распределенный робот для обхода веба в больших объемах / Александр С...
Frontera распределенный робот для обхода веба в больших объемах / Александр С...Frontera распределенный робот для обхода веба в больших объемах / Александр С...
Frontera распределенный робот для обхода веба в больших объемах / Александр С...Ontico
 
Oreilly Webcast Jun17
Oreilly Webcast Jun17Oreilly Webcast Jun17
Oreilly Webcast Jun17Sean Hull
 
TokuDB internals / Лесин Владислав (Percona)
TokuDB internals / Лесин Владислав (Percona)TokuDB internals / Лесин Владислав (Percona)
TokuDB internals / Лесин Владислав (Percona)Ontico
 

Mais procurados (20)

深入了解Redis
深入了解Redis深入了解Redis
深入了解Redis
 
Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012
 
分析mysql acid 设计实现
分析mysql acid 设计实现分析mysql acid 设计实现
分析mysql acid 设计实现
 
No sql but even less security
No sql but even less securityNo sql but even less security
No sql but even less security
 
Strategies for Backing Up MongoDB
Strategies for Backing Up MongoDBStrategies for Backing Up MongoDB
Strategies for Backing Up MongoDB
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Oreilly Webcast 01 19 10
Oreilly Webcast 01 19 10Oreilly Webcast 01 19 10
Oreilly Webcast 01 19 10
 
Hypertable Berlin Buzzwords
Hypertable Berlin BuzzwordsHypertable Berlin Buzzwords
Hypertable Berlin Buzzwords
 
MongoDB Shard Cluster
MongoDB Shard ClusterMongoDB Shard Cluster
MongoDB Shard Cluster
 
XtraDB 5.6 and 5.7: Key Performance Algorithms
XtraDB 5.6 and 5.7: Key Performance AlgorithmsXtraDB 5.6 and 5.7: Key Performance Algorithms
XtraDB 5.6 and 5.7: Key Performance Algorithms
 
This is redis - feature and usecase
This is redis - feature and usecaseThis is redis - feature and usecase
This is redis - feature and usecase
 
Introduction to MongoDB with PHP
Introduction to MongoDB with PHPIntroduction to MongoDB with PHP
Introduction to MongoDB with PHP
 
5 Tips for Getting Started with Pivotal GemFire
5 Tips for Getting Started with Pivotal GemFire5 Tips for Getting Started with Pivotal GemFire
5 Tips for Getting Started with Pivotal GemFire
 
Containers > VMs
Containers > VMsContainers > VMs
Containers > VMs
 
Boosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringBoosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uring
 
Frontera распределенный робот для обхода веба в больших объемах / Александр С...
Frontera распределенный робот для обхода веба в больших объемах / Александр С...Frontera распределенный робот для обхода веба в больших объемах / Александр С...
Frontera распределенный робот для обхода веба в больших объемах / Александр С...
 
Erlang scheduler
Erlang schedulerErlang scheduler
Erlang scheduler
 
Oreilly Webcast Jun17
Oreilly Webcast Jun17Oreilly Webcast Jun17
Oreilly Webcast Jun17
 
Redis acc 2015_eng
Redis acc 2015_engRedis acc 2015_eng
Redis acc 2015_eng
 
TokuDB internals / Лесин Владислав (Percona)
TokuDB internals / Лесин Владислав (Percona)TokuDB internals / Лесин Владислав (Percona)
TokuDB internals / Лесин Владислав (Percona)
 

Semelhante a MongoUK - Approaching 1 billion documents with MongoDB1 Billion Documents

Open Source Data Deduplication
Open Source Data DeduplicationOpen Source Data Deduplication
Open Source Data DeduplicationRedWireServices
 
Some analysis of BlueStore and RocksDB
Some analysis of BlueStore and RocksDBSome analysis of BlueStore and RocksDB
Some analysis of BlueStore and RocksDBXiao Yan Li
 
Loadays MySQL
Loadays MySQLLoadays MySQL
Loadays MySQLlefredbe
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best PracticesJeff Larkin
 
High Performance Scaling Techniques in Golang Using Go Assembly
High Performance Scaling Techniques in Golang Using Go AssemblyHigh Performance Scaling Techniques in Golang Using Go Assembly
High Performance Scaling Techniques in Golang Using Go AssemblyMinio
 
Database performance tuning for SSD based storage
Database  performance tuning for SSD based storageDatabase  performance tuning for SSD based storage
Database performance tuning for SSD based storageAngelo Rajadurai
 
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Виталий Стародубцев
 
SSD based storage tuning for databases
SSD based storage tuning for databasesSSD based storage tuning for databases
SSD based storage tuning for databasesAngelo Rajadurai
 
Ivan mamontov and Mikhail Khuldnev Present : Fast Decompression Lucene Codec
Ivan mamontov and Mikhail Khuldnev Present : Fast Decompression Lucene CodecIvan mamontov and Mikhail Khuldnev Present : Fast Decompression Lucene Codec
Ivan mamontov and Mikhail Khuldnev Present : Fast Decompression Lucene CodecGrid Dynamics
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia DatabasesJaime Crespo
 
Speedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderSpeedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderGregSmith458515
 
SDC20 ScaleFlux.pptx
SDC20 ScaleFlux.pptxSDC20 ScaleFlux.pptx
SDC20 ScaleFlux.pptxssuserabc741
 
ZFS and MySQL on Linux, the Sweet Spots
ZFS and MySQL on Linux, the Sweet SpotsZFS and MySQL on Linux, the Sweet Spots
ZFS and MySQL on Linux, the Sweet SpotsJervin Real
 
Dmx3 950-technical specifications
Dmx3 950-technical specificationsDmx3 950-technical specifications
Dmx3 950-technical specificationsRaghul P
 
The Proper Care and Feeding of MySQL Databases
The Proper Care and Feeding of MySQL DatabasesThe Proper Care and Feeding of MySQL Databases
The Proper Care and Feeding of MySQL DatabasesDave Stokes
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing GuideJose De La Rosa
 

Semelhante a MongoUK - Approaching 1 billion documents with MongoDB1 Billion Documents (20)

Open Source Data Deduplication
Open Source Data DeduplicationOpen Source Data Deduplication
Open Source Data Deduplication
 
Some analysis of BlueStore and RocksDB
Some analysis of BlueStore and RocksDBSome analysis of BlueStore and RocksDB
Some analysis of BlueStore and RocksDB
 
Stabilizing Ceph
Stabilizing CephStabilizing Ceph
Stabilizing Ceph
 
Loadays MySQL
Loadays MySQLLoadays MySQL
Loadays MySQL
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best Practices
 
High Performance Scaling Techniques in Golang Using Go Assembly
High Performance Scaling Techniques in Golang Using Go AssemblyHigh Performance Scaling Techniques in Golang Using Go Assembly
High Performance Scaling Techniques in Golang Using Go Assembly
 
Database performance tuning for SSD based storage
Database  performance tuning for SSD based storageDatabase  performance tuning for SSD based storage
Database performance tuning for SSD based storage
 
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
 
SSD based storage tuning for databases
SSD based storage tuning for databasesSSD based storage tuning for databases
SSD based storage tuning for databases
 
Lrz kurs: big data analysis
Lrz kurs: big data analysisLrz kurs: big data analysis
Lrz kurs: big data analysis
 
Ivan mamontov and Mikhail Khuldnev Present : Fast Decompression Lucene Codec
Ivan mamontov and Mikhail Khuldnev Present : Fast Decompression Lucene CodecIvan mamontov and Mikhail Khuldnev Present : Fast Decompression Lucene Codec
Ivan mamontov and Mikhail Khuldnev Present : Fast Decompression Lucene Codec
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 
Speedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderSpeedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql Loader
 
SDC20 ScaleFlux.pptx
SDC20 ScaleFlux.pptxSDC20 ScaleFlux.pptx
SDC20 ScaleFlux.pptx
 
ZFS and MySQL on Linux, the Sweet Spots
ZFS and MySQL on Linux, the Sweet SpotsZFS and MySQL on Linux, the Sweet Spots
ZFS and MySQL on Linux, the Sweet Spots
 
Dmx3 950-technical specifications
Dmx3 950-technical specificationsDmx3 950-technical specifications
Dmx3 950-technical specifications
 
The Smug Mug Tale
The Smug Mug TaleThe Smug Mug Tale
The Smug Mug Tale
 
The Proper Care and Feeding of MySQL Databases
The Proper Care and Feeding of MySQL DatabasesThe Proper Care and Feeding of MySQL Databases
The Proper Care and Feeding of MySQL Databases
 
NoSQL with MySQL
NoSQL with MySQLNoSQL with MySQL
NoSQL with MySQL
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
 

Mais de Boxed Ice

MongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingMongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingBoxed Ice
 
MongoUK 2011 - Rplacing RabbitMQ with MongoDB
MongoUK 2011 - Rplacing RabbitMQ with MongoDBMongoUK 2011 - Rplacing RabbitMQ with MongoDB
MongoUK 2011 - Rplacing RabbitMQ with MongoDBBoxed Ice
 
MongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingMongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingBoxed Ice
 
MongoDB - Monitoring & queueing
MongoDB - Monitoring & queueingMongoDB - Monitoring & queueing
MongoDB - Monitoring & queueingBoxed Ice
 
Monitoring MongoDB (MongoUK)
Monitoring MongoDB (MongoUK)Monitoring MongoDB (MongoUK)
Monitoring MongoDB (MongoUK)Boxed Ice
 
Monitoring MongoDB (MongoSV)
Monitoring MongoDB (MongoSV)Monitoring MongoDB (MongoSV)
Monitoring MongoDB (MongoSV)Boxed Ice
 
MongoUK - PHP Development
MongoUK - PHP DevelopmentMongoUK - PHP Development
MongoUK - PHP DevelopmentBoxed Ice
 
MongoUK - PHP Development
MongoUK - PHP DevelopmentMongoUK - PHP Development
MongoUK - PHP DevelopmentBoxed Ice
 

Mais de Boxed Ice (8)

MongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingMongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and Queueing
 
MongoUK 2011 - Rplacing RabbitMQ with MongoDB
MongoUK 2011 - Rplacing RabbitMQ with MongoDBMongoUK 2011 - Rplacing RabbitMQ with MongoDB
MongoUK 2011 - Rplacing RabbitMQ with MongoDB
 
MongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingMongoDB - Monitoring and queueing
MongoDB - Monitoring and queueing
 
MongoDB - Monitoring & queueing
MongoDB - Monitoring & queueingMongoDB - Monitoring & queueing
MongoDB - Monitoring & queueing
 
Monitoring MongoDB (MongoUK)
Monitoring MongoDB (MongoUK)Monitoring MongoDB (MongoUK)
Monitoring MongoDB (MongoUK)
 
Monitoring MongoDB (MongoSV)
Monitoring MongoDB (MongoSV)Monitoring MongoDB (MongoSV)
Monitoring MongoDB (MongoSV)
 
MongoUK - PHP Development
MongoUK - PHP DevelopmentMongoUK - PHP Development
MongoUK - PHP Development
 
MongoUK - PHP Development
MongoUK - PHP DevelopmentMongoUK - PHP Development
MongoUK - PHP Development
 

Último

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 

Último (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 

MongoUK - Approaching 1 billion documents with MongoDB1 Billion Documents

  • 1. Approaching 1 Billion Documents in MongoDB David Mytton 1/30 david@boxedice.com / @davidmytton
  • 2. Server Density Monitoring Processing Database UI 2/30 www.serverdensity.com
  • 3. Cache / Data Store Postback checksLatest checksHistorical 3/30
  • 4. db.stats() Documents 937,393,315 Collections 27,566 Indexes 45,277 Stored data 638GB Inserts 5000-8000/s 4/30 As of 17th Jun 2010.
  • 5. 13 months ago 5/30 Why we moved: http://bit.ly/mysqltomongo
  • 6. Initial Setup Replication Master Slave DC1 DC2 8GB RAM 8GB RAM 6/30
  • 7. Vertical Scaling Replication Master Slave DC1 DC2 72GB RAM 8GB RAM 7/30
  • 8. Tip #1 Keep your indexes in memory at all times. db.stats() 8/30
  • 9. i/o not an issue 9/30
  • 10. Tip #2 Data is flushed to disk every 60s. db.runCommand({fsync:1}); --syncdelay [60] 10/30
  • 11. Sharding solves everything 11/30
  • 12. Manual Partitioning Replication Master A Slave A DC1 DC2 16GB RAM 16GB RAM Replication Master B Slave B DC1 DC2 12/30 16GB RAM 16GB RAM
  • 13. Sustained Traffic Master Slave Avg out: 2.4Mbit/s Avg out: 4.0Mbit/s Avg in: 3.8Mbit/s Avg in: 111.2Kbit/s 13/30
  • 14. Database vs collections • Many databases = many data files (small but quickly get large). • Many collections = watch namespace limit. 14/30
  • 15. Namespaces = Number of collections + number of indexes 15/30
  • 16. Tip #3 Monitor the 24,000 namespace limit. 16/30
  • 18. Console db.system.namespaces.count() 18/30
  • 19. Replica Pairs = Failover Replica Pair Master A Slave A DC1 DC2 16GB RAM 16GB RAM Replica Pair Master B Slave B DC1 DC2 19/30 16GB RAM 16GB RAM
  • 20. Tip #4 Pre-provision your oplog files. 20/30
  • 21. A shell script to generate 75GB oplog files for i in {0..40} do echo $i head -c 2146435072 /dev/zero > local.$i done 21/30
  • 22. Tip #5 Expect slower performance during initial replica sync. 22/30
  • 23. Tip #6 You can rotate your log files from the console. 23/30
  • 24. Rotating your log files db.runCommand("logRotate") 24/30
  • 25. Tip #7 Index creation blocks by default. Use background indexing if necessary. 25/30 MongoDB Manual: http://bit.ly/mongobgindex
  • 26. Tip #8 Increase your OS file descriptor limit + use persistent connections. 26/30
  • 27. Too many open files! /etc/security/limits.conf mongo hard nofile 10000 mongo soft nofile 10000 user type limit /etc/ssh/sshd_config UsePAM yes 27/30
  • 28. Space is not reused Data + indexes 551GB Actual disk usage 638GB Fixed in 1.1.4 1.3.x 1.5.0 1.5.1 1.5.2 1.5.3 1.5.4? 28/30 JIRA: SERVER-366
  • 29. Summary 1. Keep indexes in memory. 2. Data is flushed to disk every 60s. 3. Monitor the 24k namespace limit. 4. Pre-provision oplog files. 5. Expect slower performance on replica sync. 6. Rotate logs from the console. 7. Index creation blocks by default. 29/30 8. OS file descriptor limit + persistent connections.
  • 30. Slides blog.boxedice.com/mongodb David Mytton 30/30 david@boxedice.com / @davidmytton