Indexing, Query Optimization, the Query
                 Optimizer — MongoAustin

                                   Mathias Stearn
                                     10gen Inc.
                                 mathias@10gen.com
                                  @mathias_mongo


                                   February 15, 2011




MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin
Indexing Basics




         Indexes are tree-structured sets of references to your
         documents.
         The query planner can employ indexes to efficiently enumerate
         and sort matching documents.
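
   As a rough illustration of the two points above, here is a toy model
   in plain JavaScript (Node.js). It assumes a sorted array of
   [key, reference] pairs can stand in for the real B-tree; the function
   names are made up for this sketch and are not MongoDB APIs.

```javascript
// Toy model of a single-field index: a sorted list of [key, docRef] pairs.
// Assumption: a sorted array stands in for MongoDB's on-disk B-tree.
function buildIndex(docs, field) {
  return docs
    .map((doc, ref) => [doc[field], ref])   // ref = position in the collection
    .filter(([key]) => key !== undefined)   // documents without the field are skipped here
    .sort((p, q) => p[0] - q[0]);           // numeric keys only, for brevity
}

// Binary search: first position whose key is >= target.
function lowerBound(index, target) {
  let lo = 0, hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] < target) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// Range lookup min <= key <= max: jump to the start, then walk in key order.
function rangeScan(index, min, max) {
  const refs = [];
  for (let i = lowerBound(index, min); i < index.length && index[i][0] <= max; i++) {
    refs.push(index[i][1]);
  }
  return refs;
}

const docs = [{ x: 7 }, { x: 2 }, { x: 9 }, { x: 4 }];
const xIndex = buildIndex(docs, "x");
const hits = rangeScan(xIndex, 3, 8).map((ref) => docs[ref]);
console.log(hits); // documents with 3 <= x <= 8, already in x order
```

   Because the entries are kept in key order, the same walk answers
   both the range match and a sort on x without a separate sort step.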




However, indexing strikes people as a gray art




         As is the case with relational systems, schema design and
         indexing go hand in hand...
         ...but you also need to know about your actual (not just
         predicted) query patterns.




Some indexing generalities




         A collection may have at most 64 indexes.
         A query may only use 1 index (except for disjuncts of $or
         queries).
         Indexes entail additional work on inserts, updates, deletes.




Creating Indexes
   The _id attribute is always indexed. Additional indexes can be
   created with ensureIndex():

      // Create an index on the user attribute
      db.collection.ensureIndex({ user : 1 })
      // Create a compound index on
      // the user and email attributes
      db.collection.ensureIndex({ user : 1, email : 1 })
      // Create an index on the tags attribute,
      // will index all values in list
      db.collection.ensureIndex({ tags : 1 })
      // Create a unique index on the user attribute
      db.collection.ensureIndex({user:1}, {unique:true})
      // Create an index in the background.
      db.collection.ensureIndex({user:1}, {background:true})

Index maintenance




   // Drops an index on x
   db.collection.dropIndex({x:1})
   // Drops all indexes except _id
   db.collection.dropIndexes()
   // Rebuild and compact indexes
   db.collection.reIndex()




Indexes are smart about data types and structures




         Indexes on attributes whose values are of different types in
         different documents can speed up queries by skipping
         documents where the relevant attribute isn’t of the
         appropriate type.
         Indexes on attributes whose values are lists will index each
         element, speeding up queries that look into these attributes.
         (You really want to do this for querying on tags.)
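
   The per-element indexing of list values can be pictured with a small
   toy model in plain JavaScript (Node.js). This is only a sketch: the
   helper names and sample documents are invented, and scalar values
   are assumed to behave like one-element lists.

```javascript
// Toy multikey index: one [key, docRef] entry per array element.
// Assumption: scalar values are treated as one-element arrays.
function buildMultikeyIndex(docs, field) {
  const entries = [];
  docs.forEach((doc, ref) => {
    const value = doc[field];
    if (value === undefined) return;
    const keys = Array.isArray(value) ? value : [value];
    // A Set drops repeated elements so each document appears once per key.
    for (const key of new Set(keys)) entries.push([key, ref]);
  });
  // Sort by key (then by reference) so equal keys sit next to each other.
  return entries.sort((p, q) => (p[0] < q[0] ? -1 : p[0] > q[0] ? 1 : p[1] - q[1]));
}

// Equality lookup. A real index would seek to the key; the linear
// filter here only keeps the sketch short.
function lookup(index, key) {
  return index.filter(([k]) => k === key).map(([, ref]) => ref);
}

const posts = [
  { title: "a", tags: ["mongodb", "indexing"] },
  { title: "b", tags: ["indexing"] },
  { title: "c", tags: "mongodb" },
];
const tagIndex = buildMultikeyIndex(posts, "tags");
const indexingPosts = lookup(tagIndex, "indexing").map((ref) => posts[ref].title);
console.log(indexingPosts);
```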




When can indexes be used?


   In short, if you can envision how an index might get used, it
   probably will be. These will all use an index on x:
         db.collection.find( { x : 1 } )
         db.collection.find( { x : { $in : [1,2,3] } } )
         db.collection.find( { x : { $gt : 1 } } )
         db.collection.find( { x : /^a/ } )
         db.collection.count( { x : 2 } )
         db.collection.distinct( "x" )
         db.collection.find().sort( { x : 1 } )




Trickier cases where indexes can be used




         db.collection.find({ x : 1 }).sort({ y : 1 })
         will use an index on y for sorting, if there’s no index on x.
         (For this sort of case, use a compound index on both x and y
         in that order.)
         db.collection.update( { x : 2 } , { x : 3 } )
         will use and update an index on x.




Some array examples



   The following queries will use an index on x, and will match
   documents whose x attribute is the array [2,10]:
         db.collection.find({ x : 2 })
         db.collection.find({ x : 10 })
         db.collection.find({ x : { $gt : 5 } })
         db.collection.find({ x : [2,10] })
         db.collection.find({ x : { $in : [2,5] } })




Geospatial indexes


   Geospatial indexes are a sort of special case; the operators that can
   take advantage of them can only be used if the relevant indexes
   have been created. Some examples:
         db.collection.find({ a : [50, 50] }) finds a
         document with this point for a.
         db.collection.find({ a : { $near : [50, 50] } })
         sorts results by distance.
         db.collection.find({ a : { $within : { $box : [[40,40],[60,60]] } } })
         db.collection.find({ a : { $within : { $center : [[50,50],10] } } })



When indexes cannot be used

         Many sorts of negations, e.g., $ne, $not.
         Tricky arithmetic, e.g., $mod.
         Most regular expressions (e.g., /a/).
         Expressions in $where clauses don’t take advantage of
         indexes.
                Of course, $where clauses are mostly for complex queries that
                often can’t be indexed anyway, e.g., "where a > b". (If
                these cases matter to you, you can precompute the match,
                store it as an additional attribute, index it, and skip
                the $where clause entirely.)
         map/reduce can’t take advantage of indexes (mapping
         function is opaque to the query optimizer).
   As a rule, if you can’t imagine how an index might be used, it
   probably can’t be!

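   The precompute-and-index workaround for "where a > b" might look
   like the following sketch in plain JavaScript (Node.js). The
   attribute name a_gt_b and the helper name are hypothetical, and the
   shell calls appear only as comments because they need a running
   server.

```javascript
// Sketch: maintain a precomputed flag so "a > b" becomes an ordinary,
// indexable equality match. The attribute name a_gt_b is hypothetical.
function withPrecomputedMatch(doc) {
  return { ...doc, a_gt_b: doc.a > doc.b };
}

const toInsert = [{ a: 5, b: 3 }, { a: 1, b: 8 }].map(withPrecomputedMatch);
console.log(toInsert);

// In the shell, the flag would then be written and indexed like any other
// attribute (comments only; these calls need a running server):
//   toInsert.forEach(function (d) { db.collection.insert(d) })
//   db.collection.ensureIndex({ a_gt_b : 1 })
//   db.collection.find({ a_gt_b : true })   // instead of a $where clause
```

   The cost of this approach is that the flag has to be recomputed by
   the application whenever a or b changes.
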
Never forget about compound indexes




         Whenever you’re querying on multiple attributes, whether as
         part of the selector document or in a sort(), compound
         indexes can be used.




Schema/index relationships
   Sometimes the question isn’t “given the shape of these documents,
   how do I index them?”, but “how might I shape the data so I can
   take advantage of indexing?”

   // Consider a schema that uses a list of
   // attribute/value pairs:
   db.c.insert({ product : "SuperDooHickey",
                 manufacturer : "Foo Enterprises",
                 catalog : [ { stock : 50,
                               modtime : "2010-09-02" },
                             { price : 29.95,
                               modtime : "2010-06-14" } ] });
   db.c.ensureIndex({ catalog : 1 });
   // All attribute queries can use one index.
   db.c.find( { catalog : { stock : { $gt : 0 } } } )

Index sizes



   Of course, indexes take up space. For many interesting databases,
   real query performance will depend on index sizes; so it’s useful to
   see these numbers.
         db.collection.stats() shows indexSizes, the size of
         each index in the collection.
         db.stats() includes the total size of all indexes in the
         database.
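
   To see which indexes dominate, the indexSizes map can be sorted.
   The sketch below is plain JavaScript (Node.js); the helper name and
   the sample numbers are invented, and only the indexSizes field
   mentioned above is assumed to be present.

```javascript
// List indexes largest-first from a collection stats() document.
// Only the indexSizes field is used; the sample numbers are invented.
function indexesBySize(stats) {
  return Object.entries(stats.indexSizes)
    .sort(([, p], [, q]) => q - p)             // descending by size
    .map(([name, bytes]) => ({ name, bytes }));
}

const sampleStats = { indexSizes: { _id_: 4096000, user_1: 8192000, tags_1: 2048000 } };
const ranked = indexesBySize(sampleStats);
console.log(ranked);
```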




explain()

   It’s useful to be able to ensure that your query is doing what you
   want it to do. For this, we have explain(). Query plans that use
   an index have cursor type BtreeCursor.

   db.collection.find({x:{$gt:5}}).explain()
   {
       "cursor" : "BtreeCursor x_1",
       ...
       "nscanned" : 100,
       ...
       "n" : 100,
       "millis" : 0,
       ...
   }


explain(), continued

   If the query plan doesn’t use the index, the cursor type will be
   BasicCursor.

   db.collection.find({x:{$gt:5}}).explain()
   {
       "cursor" : "BasicCursor",
       ...
       "nscanned" : 12345,
       ...
       "n" : 100,
       "millis" : 4,
       ...
   }
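
   One way to read these two outputs side by side is to compare how
   many entries were scanned per document returned. The helper below
   is a sketch in plain JavaScript (Node.js); its name is made up, and
   it relies only on the cursor, nscanned and n fields shown above.

```javascript
// Summarize an explain() document using only the fields shown on the
// slides: cursor, nscanned, n. The helper name is made up for this sketch.
function summarizeExplain(plan) {
  return {
    usesIndex: plan.cursor.startsWith("BtreeCursor"),
    // Entries examined per document returned; close to 1 is the goal.
    scannedPerResult: plan.n > 0 ? plan.nscanned / plan.n : plan.nscanned,
  };
}

const indexed = summarizeExplain({ cursor: "BtreeCursor x_1", nscanned: 100, n: 100, millis: 0 });
const unindexed = summarizeExplain({ cursor: "BasicCursor", nscanned: 12345, n: 100, millis: 4 });
console.log(indexed, unindexed);
```

   With the numbers from the slides, the indexed plan scans one entry
   per result while the unindexed plan scans over a hundred.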


Really, compound indexes are important

   Try this at home:
      1   Create a collection with a few tens of thousands of documents
          having two attributes (let’s call them a and b).
      2   Create a compound index on {a : 1, b : 1}.
      3   Run db.collection.find({a : constant}).sort({b : 1}).explain().
      4   Note the explain result’s millis.
      5   Drop the compound index.
      6   Create another compound index with the attributes reversed.
          (This will be a suboptimal compound index.)
      7   Explain the above query again.
      8   The suboptimal index should produce a slower explain result.
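
   Why the reversed index is slower can be previewed without a server.
   The sketch below is a toy model in plain JavaScript (Node.js): an
   index is assumed to be a sorted array of [firstKey, secondKey, ref]
   entries, "scanned" counts the entries touched, and the data set is
   invented. It is not how the server is implemented.

```javascript
// Toy comparison of the two compound-index orders for
// find({a: constant}).sort({b: 1}).
function buildCompound(docs, first, second) {
  return docs
    .map((d, ref) => [d[first], d[second], ref])
    .sort((p, q) => p[0] - q[0] || p[1] - q[1]);
}

// Index {a:1, b:1}: entries with a == constant are contiguous and
// already sorted by b.
function scanAB(index, constant) {
  // Stand-in for the B-tree seek to the first entry with a >= constant.
  let i = index.findIndex(([a]) => a >= constant);
  const refs = [];
  let scanned = 0;
  for (; i !== -1 && i < index.length; i++) {
    scanned++;
    if (index[i][0] !== constant) break;   // left the matching range
    refs.push(index[i][2]);
  }
  return { refs, scanned };
}

// Index {b:1, a:1}: walking it yields b order, but every entry has to be
// checked against a == constant.
function scanBA(index, constant) {
  const refs = [];
  let scanned = 0;
  for (const [, a, ref] of index) {
    scanned++;
    if (a === constant) refs.push(ref);
  }
  return { refs, scanned };
}

// 10,000 documents with a in 0..99 and distinct b values in 0..9999.
const docs = Array.from({ length: 10000 }, (_, i) => ({ a: i % 100, b: (i * 7919) % 10000 }));
const good = scanAB(buildCompound(docs, "a", "b"), 42);
const bad = scanBA(buildCompound(docs, "b", "a"), 42);
console.log(good.scanned, bad.scanned);
```

   Both scans return the same 100 documents in the same order; the
   difference is how much of the index each one has to touch.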

The DB Profiler
  MongoDB includes a database profiler that, when enabled, records
  the timing measurements and result counts in a collection within
  the database.
  // Enable the profiler on this database.
  > db.setProfilingLevel(1, 100)
  { "was" : 0, "slowms" : 100, "ok" : 1 }
  > db.foo.find({a: { $mod : [3, 0] } }).count();
  ...
  // See the profiler info.
  > db.system.profile.find()
  { "ts" : "Thu Nov 18 2010 06:46:16 GMT-0500 (EST)",
     "info" : "query test.$cmd ntoreturn:1
         command: { count: "foo",
                               query: { a: { $mod: [ 3.0, 0.0 ] } },
         fields: {} } reslen:64 406ms",
     "millis" : 406 }

Query Optimizer




         MongoDB’s query optimizer is empirical, not cost-based.
         To test query plans, it tries several in parallel, and records the
         plan that finishes fastest.
         If a plan’s performance changes over time (e.g., as data
         changes), the database will reoptimize (i.e., retry all possible
         plans).
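
   The "try several, remember the winner" idea can be sketched as a
   toy in plain JavaScript (Node.js). Everything here is an assumption
   for illustration: a plan is modeled as a generator that yields once
   per unit of work, and the plan names, step counts and cache key are
   invented. It is not the server's actual implementation.

```javascript
// Toy model of an empirical optimizer: advance every candidate plan one
// step at a time and remember whichever finishes first.
function* plan(workUnits) {
  for (let i = 0; i < workUnits; i++) yield i;
}

function racePlans(candidates) {
  const running = Object.entries(candidates).map(([name, gen]) => ({ name, gen }));
  while (running.length > 0) {
    for (const { name, gen } of running) {
      if (gen.next().done) return name;   // first plan to run out of work wins
    }
  }
  return null;                            // no candidates were supplied
}

// Remember the winner per query shape so later queries skip the race.
const planCache = new Map();
const queryShape = JSON.stringify({ x: "$gt" });   // hypothetical cache key
planCache.set(queryShape, racePlans({
  "BtreeCursor x_1": plan(100),
  "BasicCursor": plan(12345),
}));
console.log(planCache.get(queryShape));
```

   Reoptimizing, in this picture, amounts to deleting the cache entry
   and running the race again.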




Hinting the query plan




   Sometimes, you might want to force the query plan. For this, we
   have hint().

   // Force the use of an index on attribute x:
   db.collection.find({x: 1, ...}).hint({x:1})
   // Force indexes to be avoided!
   db.collection.find({x: 1, ...}).hint({$natural:1})




Going forward



         www.mongodb.org — downloads, docs, community
         mongodb-user@googlegroups.com — mailing list
         #mongodb on irc.freenode.net
         try.mongodb.org — web-based shell
         10gen is hiring. Email jobs@10gen.com.
         10gen offers support, training, and advising services for
         MongoDB.




Indexing and Query Optimizer (Mongo Austin)

  • 1. Indexing, Query Optimization, the Query Optimizer — MongoAustin Mathias Stearn 10gen Inc. mathias@10gen.com @mathias mongo February 15, 2011 MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin
  • 2. Indexing Basics Indexes are tree-structured sets of references to your documents. The query planner can employ indexes to efficiently enumerate and sort matching documents. MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin
  • 3. However, indexing strikes people as a gray art As is the case with relational systems, schema design and indexing go hand in hand. . . . . . but you also need to know about your actual (not just predicted) query patterns. MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin
  • 4. Some indexing generalities A collection may have at most 64 indexes. A query may only use 1 index (except for disjuncts of $or queries). Indexes entail additional work on inserts, updates, deletes. MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin
  • 5. Creating Indexes The id attribute is always indexed. Additional indexes can be created with ensureIndex(): // Create an index on the user attribute db.collection.ensureIndex({ user : 1 }) // Create a compound index on // the user and email attributes db.collection.ensureIndex({ user : 1, email : 1 }) // Create an index on the tags attribute, // will index all values in list db.collection.ensureIndex({ tags : 1 }) // Create a unique index on the user attribte db.collection.ensureIndex({user:1}, {unique:true}) // Create an index in the background. db.collection.ensureIndex({user:1}, {background:true}) MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin
  • 6. Index maintenance // Drops an index on x db.collection.dropIndex({x:1}) // Drops all indexes except _id db.collection.dropIndexes() // Rebuild and compact indexes db.collection.reIndex() MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin
  • 7. Indexes are smart about data types and structures Indexes on attributes whose values are of different types in different documents can speed up queries by skipping documents where the relevant attribute isn’t of the appropriate type. Indexes on attributes whose values are lists will index each element, speeding up queries that look into these attributes. (You really want to do this for querying on tags.) MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin
  • 8. When can indexes be used? In short, if you can envision how the index might get used, it probably will be. These will all use an index on x:
      db.collection.find( { x : 1 } )
      db.collection.find( { x : { $in : [1,2,3] } } )
      db.collection.find( { x : { $gt : 1 } } )
      db.collection.find( { x : /^a/ } )
      db.collection.count( { x : 2 } )
      db.collection.distinct( "x" )
      db.collection.find().sort( { x : 1 } )
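One way to picture why an anchored expression such as /^a/ can use an index: a prefix corresponds to one contiguous range of a sorted key list. The sketch below is a plain JavaScript toy model, not MongoDB's implementation; sortedKeys, lowerBound, and prefixRange are hypothetical names introduced for illustration.

```javascript
// Toy model of an index on x: a sorted array of string keys.
// Illustration only; this is not how MongoDB stores or scans indexes.
const sortedKeys = ["apple", "avocado", "banana", "bat", "cherry"];

// Binary search: first position whose key is >= target.
function lowerBound(keys, target) {
  let lo = 0, hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < target) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// A non-empty prefix such as "a" (from /^a/) maps to the key range
// [prefix, upper), where upper is the prefix with its last character
// incremented. (Assumes the last character is not the maximum code unit.)
function prefixRange(keys, prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return keys.slice(lowerBound(keys, prefix), lowerBound(keys, upper));
}

const matches = prefixRange(sortedKeys, "a");
console.log(matches);
```

An unanchored expression such as /a/ would also match "banana" and "bat", which do not sit in one contiguous range of the sorted keys, so a range scan of this kind does not apply.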
  • 9. Trickier cases where indexes can be used
      db.collection.find({ x : 1 }).sort({ y : 1 })
      will use an index on y for sorting, if there’s no index on x. (For this sort of case, use a compound index on both x and y, in that order.)
      db.collection.update( { x : 2 } , { x : 3 } )
      will use and update an index on x.
  • 10. Some array examples The following queries will use an index on x, and will match documents whose x attribute is the array [2,10]:
      db.collection.find({ x : 2 })
      db.collection.find({ x : 10 })
      db.collection.find({ x : { $gt : 5 } })
      db.collection.find({ x : [2,10] })
      db.collection.find({ x : { $in : [2,5] } })
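The element-wise indexing of arrays can be sketched as follows. This is a toy model in plain JavaScript, assuming numeric values only; buildIndex, lookup, and the sample documents are hypothetical and are not part of MongoDB.

```javascript
// Toy model of an index over an array-valued attribute:
// one index entry per array element. Illustration only.
const docs = [
  { _id: 1, x: [2, 10] },
  { _id: 2, x: 7 },
  { _id: 3, x: [1, 3] },
];

function buildIndex(documents, field) {
  const entries = [];
  for (const doc of documents) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push({ key, id: doc._id });
  }
  // Keep entries sorted by key, as a tree-structured index would.
  return entries.sort((p, q) => p.key - q.key);
}

// Return the distinct _ids whose index keys satisfy the predicate.
// A real index would seek to a key range; filter keeps the sketch short.
function lookup(index, predicate) {
  return [...new Set(index.filter(e => predicate(e.key)).map(e => e.id))];
}

const index = buildIndex(docs, "x");
const eq2 = lookup(index, k => k === 2);             // like { x : 2 }
const gt5 = lookup(index, k => k > 5);               // like { x : { $gt : 5 } }
const in25 = lookup(index, k => [2, 5].includes(k)); // like { x : { $in : [2,5] } }
console.log(eq2, gt5, in25);
```

In this model the document with x equal to [2,10] is reachable through either of its elements, which is why each of the element-wise queries finds it. The whole-array match { x : [2,10] } is not modeled here.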
  • 11. Geospatial indexes Geospatial indexes are a sort of special case; the operators that can take advantage of them can only be used if the relevant indexes have been created. Some examples:
      db.collection.find({ a : [50, 50] })
      finds a document with this point for a.
      db.collection.find({ a : { $near : [50, 50] } })
      sorts results by distance.
      db.collection.find({ a : { $within : { $box : [[40,40],[60,60]] } } })
      db.collection.find({ a : { $within : { $center : [[50,50],10] } } })
  • 12. When indexes cannot be used Many sorts of negations, e.g., $ne, $not. Tricky arithmetic, e.g., $mod. Most regular expressions (e.g., /a/). Expressions in $where clauses don’t take advantage of indexes. Of course, $where clauses are mostly for complex queries that often can’t be indexed anyway, e.g., “where a > b”. (If these cases matter to you, you can precompute the match, store it as an additional attribute, index it, and skip the $where clause entirely.) map/reduce can’t take advantage of indexes (the mapping function is opaque to the query optimizer). As a rule, if you can’t imagine how an index might be used, it probably can’t!
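The precomputation idea for a “where a > b” test might look like the following sketch. It is plain JavaScript over in-memory objects; the attribute name aGreaterThanB and the helper withPrecomputedFlag are hypothetical.

```javascript
// Sketch: replace a "where a > b" style test with a stored flag.
// The attribute name aGreaterThanB is an assumption for illustration.
function withPrecomputedFlag(doc) {
  return { ...doc, aGreaterThanB: doc.a > doc.b };
}

const stored = [
  { a: 5, b: 3 },
  { a: 1, b: 4 },
  { a: 9, b: 9 },
].map(withPrecomputedFlag);

// With the flag stored (and indexed, in a real database), the query becomes
// a plain equality match instead of evaluating an expression per document.
const hits = stored.filter(doc => doc.aGreaterThanB === true);
console.log(hits);
```

The trade-off is that the flag has to be recomputed whenever a or b changes, so every write path that touches those attributes needs to maintain it.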
  • 13. Never forget about compound indexes Whenever you’re querying on multiple attributes, whether as part of the selector document or in a sort(), compound indexes can be used.
  • 14. Schema/index relationships Sometimes the question isn’t “given the shape of these documents, how do I index them?”, but “how might I shape the data so I can take advantage of indexing?”
      // Consider a schema that uses a list of
      // attribute/value pairs:
      db.c.insert({
        product : "SuperDooHickey",
        manufacturer : "Foo Enterprises",
        catalog : [
          { stock : 50, modtime : '2010-09-02' },
          { price : 29.95, modtime : '2010-06-14' }
        ]
      });
      db.c.ensureIndex({ catalog : 1 });
      // All attribute queries can use one index.
      db.c.find( { catalog : { stock : { $gt : 0 } } } )
  • 15. Index sizes Of course, indexes take up space. For many interesting databases, real query performance will depend on index sizes; so it’s useful to see these numbers. db.collection.stats() shows indexSizes, the size of each index in the collection. db.stats() includes the total size of all indexes in the database.
  • 16. explain() It’s useful to be able to ensure that your query is doing what you want it to do. For this, we have explain(). Query plans that use an index have cursor type BtreeCursor.
      db.collection.find({x:{$gt:5}}).explain()
      {
        "cursor" : "BtreeCursor x_1",
        ...
        "nscanned" : 100,
        ...
        "n" : 100,
        "millis" : 0,
        ...
      }
  • 17. explain(), continued If the query plan doesn’t use the index, the cursor type will be BasicCursor.
      db.collection.find({x:{$gt:5}}).explain()
      {
        "cursor" : "BasicCursor",
        ...
        "nscanned" : 12345,
        ...
        "n" : 100,
        "millis" : 4,
        ...
      }
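A small helper can make the two explain() results easier to compare. This is a sketch that only reads the cursor, nscanned, and n fields shown on the slides; summarizeExplain is a hypothetical name, and the two input objects simply copy the slide values.

```javascript
// Summarize an explain()-style result using only the fields shown above.
// summarizeExplain is a hypothetical helper, not a MongoDB function.
function summarizeExplain(explain) {
  return {
    usesIndex: explain.cursor.startsWith("BtreeCursor"),
    // Entries examined per document returned; close to 1 is ideal.
    scannedPerResult: explain.nscanned / Math.max(explain.n, 1),
  };
}

const indexed = summarizeExplain({ cursor: "BtreeCursor x_1", nscanned: 100, n: 100, millis: 0 });
const unindexed = summarizeExplain({ cursor: "BasicCursor", nscanned: 12345, n: 100, millis: 4 });
console.log(indexed, unindexed);
```

A large gap between nscanned and n suggests that the query is examining many entries it does not return, which is usually the cue to look for a better index.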
  • 18. Really, compound indexes are important Try this at home:
      1. Create a collection with a few tens of thousands of documents having two attributes (let’s call them a and b).
      2. Create a compound index on {a : 1, b : 1}.
      3. Do a db.collection.find({a : constant}).sort({b : 1}).explain().
      4. Note the explain result’s millis.
      5. Drop the compound index.
      6. Create another compound index with the attributes reversed. (This will be a suboptimal compound index.)
      7. Explain the above query again.
      8. The suboptimal index should produce a slower explain result.
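The effect of attribute order in the experiment above can be approximated without a database. The sketch below models each compound index as a sorted array of key pairs and counts how many entries each one has to examine; it is a toy model, not MongoDB's query engine, and every name in it is hypothetical.

```javascript
// Toy model: a compound index is an array of [firstKey, secondKey] pairs
// sorted lexicographically. Illustration only.
function makeDocs(count) {
  const out = [];
  for (let i = 0; i < count; i++) out.push({ a: i % 10, b: (i * 7919) % count });
  return out;
}

function buildCompound(docs, first, second) {
  return docs
    .map(d => [d[first], d[second]])
    .sort((p, q) => p[0] - q[0] || p[1] - q[1]);
}

// find({a : target}).sort({b : 1}) with an index on {a : 1, b : 1}:
// matching entries are contiguous and already ordered by b.
function scanAB(index, target) {
  let scanned = 0;
  const result = [];
  // A real tree would seek to the first entry; findIndex stands in for that.
  const start = index.findIndex(e => e[0] === target);
  for (let i = start; i >= 0 && i < index.length && index[i][0] === target; i++) {
    scanned++;
    result.push(index[i][1]);
  }
  return { scanned, result };
}

// Same query with an index on {b : 1, a : 1}: results come out in b order,
// but every entry has to be examined to test a.
function scanBA(index, target) {
  let scanned = 0;
  const result = [];
  for (const [b, a] of index) {
    scanned++;
    if (a === target) result.push(b);
  }
  return { scanned, result };
}

const docs = makeDocs(20000);
const good = scanAB(buildCompound(docs, "a", "b"), 3);
const poor = scanBA(buildCompound(docs, "b", "a"), 3);
console.log(good.scanned, poor.scanned);
```

Both scans return the same values in the same order, but in this model the {a, b} index examines only the 2,000 matching entries while the reversed index examines all 20,000.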
  • 19. The DB Profiler MongoDB includes a database profiler that, when enabled, records the timing measurements and result counts in a collection within the database.
      // Enable the profiler on this database.
      > db.setProfilingLevel(1, 100)
      { "was" : 0, "slowms" : 100, "ok" : 1 }
      > db.foo.find({a: { $mod : [3, 0] } });
      ...
      // See the profiler info.
      > db.system.profile.find()
      { "ts" : "Thu Nov 18 2010 06:46:16 GMT-0500 (EST)",
        "info" : "query test.$cmd ntoreturn:1 command: { count: "foo", query: { a: { $mod: [ 3.0, 0.0 ] } }, fields: {} } reslen:64 406ms",
        "millis" : 406 }
  • 20. Query Optimizer MongoDB’s query optimizer is empirical, not cost-based. To test query plans, it tries several in parallel, and records the plan that finishes fastest. If a plan’s performance changes over time (e.g., as data changes), the database will reoptimize (i.e., retry all possible plans).
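The idea of trying plans side by side and remembering the winner can be sketched as a simple race. This is an illustration of the general approach under stated assumptions, not MongoDB's actual implementation: the plan names are borrowed from the explain() slides, and the work counts, racePlans, choosePlan, and planCache are invented for the example.

```javascript
// Sketch of empirical plan selection: step candidate plans in turn and
// remember whichever finishes first. Not MongoDB's actual implementation.
function* plan(name, unitsOfWork) {
  for (let i = 0; i < unitsOfWork; i++) yield; // one unit of scanning per step
  return name;
}

function racePlans(candidates) {
  if (candidates.length === 0) throw new Error("no candidate plans");
  const running = candidates.map(([name, work]) => plan(name, work));
  for (;;) {
    for (const p of running) {
      const step = p.next();
      if (step.done) return step.value; // first plan to finish wins
    }
  }
}

// Remember the winner per query shape so later queries skip the race.
const planCache = new Map();
function choosePlan(queryShape, candidates) {
  if (!planCache.has(queryShape)) planCache.set(queryShape, racePlans(candidates));
  return planCache.get(queryShape);
}

const winner = choosePlan("x > 5", [["BasicCursor", 12345], ["BtreeCursor x_1", 100]]);
console.log(winner);
```

Reoptimization would correspond to dropping an entry from the cache so that the race runs again; the sketch leaves that step out.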
  • 21. Hinting the query plan Sometimes, you might want to force the query plan. For this, we have hint().
      // Force the use of an index on attribute x:
      db.collection.find({x: 1, ...}).hint({x:1})
      // Force indexes to be avoided!
      db.collection.find({x: 1, ...}).hint({$natural:1})
  • 22. Going forward
      www.mongodb.org: downloads, docs, community
      mongodb-user@googlegroups.com: mailing list
      #mongodb on irc.freenode.net
      try.mongodb.org: web-based shell
      10gen is hiring. Email jobs@10gen.com.
      10gen offers support, training, and advising services for MongoDB.