#MongoDBDays




Indexing and Query
Optimization
Chad Tindel
Senior Solution Architect, 10gen
Agenda
• What are indexes?
• Why do I need them?
• Working with indexes in MongoDB
• Optimize your queries
• Avoiding common mistakes
What are indexes?
What are indexes?
Imagine you're looking for a recipe in a cookbook
ordered by recipe name. Looking up a recipe by
name is quick and easy.
What are indexes?
• How would you find a recipe using chicken?
• How about a 250-350 calorie recipe using
 chicken?
[Image placeholder: cookbook]




Consult the index!
1   2   3    4    5   6   7




        Linked List
1    2    3     4    5     6   7




    Finding 7 in Linked List
          4
      2       6
    1   3   5   7

Finding 7 in Tree
Indexes in MongoDB are B-trees
Queries, inserts and deletes:
       O(log(n)) time
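To make the O(log(n)) claim concrete, here is a minimal sketch in plain JavaScript (not MongoDB code) that counts comparisons when finding 7 among the seven sorted values from the slides. A sorted array with binary search stands in for the balanced tree; the function names `linearSteps` and `treeSteps` are illustrative only.

```javascript
// Count comparisons for a front-to-back scan, as in the linked list slide.
function linearSteps(values, target) {
  let steps = 0;
  for (const v of values) {
    steps++;
    if (v === target) return steps;
  }
  return steps;
}

// Count comparisons for binary search, which mirrors walking down a balanced tree.
function treeSteps(sortedValues, target) {
  let lo = 0;
  let hi = sortedValues.length - 1;
  let steps = 0;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    steps++;
    if (sortedValues[mid] === target) return steps;
    if (sortedValues[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return steps;
}

const values = [1, 2, 3, 4, 5, 6, 7];
console.log(linearSteps(values, 7)); // 7 comparisons
console.log(treeSteps(values, 7));   // 3 comparisons: 4, then 6, then 7
```

With seven values the gap is small, but the scan grows linearly with the collection while the tree walk grows with its logarithm.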
Indexes are the single
biggest tunable
performance factor in
MongoDB
Absent or suboptimal
indexes are the most
common avoidable
MongoDB performance
problem.
Why do I need indexes?
A brief story
Working with Indexes in
MongoDB
How do I create indexes?
// Create an index if one does not exist
db.recipes.createIndex({ main_ingredient: 1 })



// The client remembers the index and raises no errors
db.recipes.ensureIndex({ main_ingredient: 1 })




* 1 means ascending, -1 descending
What can be indexed?
// Multiple fields (compound key indexes)
db.recipes.ensureIndex({
   main_ingredient: 1,
   calories: -1
})

// Arrays of values (multikey indexes)
{
   name: 'Chicken Noodle Soup',
   ingredients : ['chicken', 'noodles']
}

db.recipes.ensureIndex({ ingredients: 1 })
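As a rough mental model (not MongoDB's actual storage format), a multikey index holds one entry per array element, each pointing back to the same document. The sketch below builds such entries in plain JavaScript; `buildIndexEntries`, the `_id` values and the second recipe are hypothetical.

```javascript
// Build [key, _id] pairs for one field; arrays contribute one entry per element.
function buildIndexEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push([key, doc._id]);
  }
  // Keep entries ordered by key, as an ascending index would.
  return entries.sort((x, y) => (x[0] < y[0] ? -1 : x[0] > y[0] ? 1 : 0));
}

const recipes = [
  { _id: 1, name: 'Chicken Noodle Soup', ingredients: ['chicken', 'noodles'] },
  { _id: 2, name: 'Chicken Salad', ingredients: ['chicken', 'lettuce'] },
];
const multikeyEntries = buildIndexEntries(recipes, 'ingredients');
console.log(multikeyEntries);
```

Both recipes appear under the key 'chicken', which is why a query on a single ingredient can find them through the index.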
What can be indexed?
// Subdocuments
{
   name : 'Apple Pie',
   contributor: {
     name: 'Joe American',
     id: 'joea123'
   }
}

db.recipes.ensureIndex({ 'contributor.id': 1 })

db.recipes.ensureIndex({ 'contributor': 1 })
How do I manage indexes?
// List a collection's indexes
db.recipes.getIndexes()
db.recipes.getIndexKeys()


// Drop a specific index
db.recipes.dropIndex({ ingredients: 1 })


// Drop all indexes and recreate them
db.recipes.reIndex()


// Default (unique) index on _id
Background Index Builds
// Index creation is a blocking operation that can take a long time
// Background creation yields to other operations
db.recipes.ensureIndex(
    { ingredients: 1 },
    { background: true }
)
Options
• Uniqueness constraints (unique, dropDups)
• Sparse Indexes
• Geospatial (2d) Indexes
• TTL Collections (expireAfterSeconds)
Uniqueness Constraints
// Only one recipe can have a given value for name
db.recipes.ensureIndex( { name: 1 }, { unique: true } )


// Force index on collection with duplicate recipe names – drop the duplicates
db.recipes.ensureIndex(
    { name: 1 },
    { unique: true, dropDups: true }
)


* dropDups is probably never what you want
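To see why dropDups is risky, the following sketch imitates its effect in plain JavaScript: the first document found for each value is kept and every later document with the same value is discarded. `simulateDropDups` and the sample documents are hypothetical and only model the behaviour; they are not the server's implementation.

```javascript
// Keep the first document seen for each value of `field`; discard the rest.
function simulateDropDups(docs, field) {
  const seen = new Set();
  return docs.filter((doc) => {
    if (seen.has(doc[field])) return false;
    seen.add(doc[field]);
    return true;
  });
}

const submitted = [
  { name: 'Apple Pie', contributor: 'joea123' },
  { name: 'Apple Pie', contributor: 'someone_else' }, // hypothetical second author
  { name: 'Chicken Noodle Soup', contributor: 'joea123' },
];
const kept = simulateDropDups(submitted, 'name');
console.log(kept.length);         // 2
console.log(kept[0].contributor); // 'joea123'; the other Apple Pie is gone
```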
Sparse Indexes
// Only documents with field calories will be indexed
db.recipes.ensureIndex(
    { calories: -1 },
    { sparse: true }
)
// Allow multiple documents to not have calories field
db.recipes.ensureIndex(
    { name: 1 , calories: -1 },
    { unique: true, sparse: true }
)
* Without sparse, missing fields are stored as null(s) in the index
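The interaction between sparse and unique can be sketched with a simplified single-field model in plain JavaScript. It assumes a non-sparse index records null for documents that lack the field and a sparse index skips them; `indexKeys`, `violatesUnique` and the sample recipes are illustrative names, not MongoDB APIs.

```javascript
// Collect index keys for a single field. Non-sparse indexes record null for
// documents that lack the field; sparse indexes skip those documents.
function indexKeys(docs, field, sparse) {
  const keys = [];
  for (const doc of docs) {
    if (field in doc) keys.push(doc[field]);
    else if (!sparse) keys.push(null);
  }
  return keys;
}

// A unique index rejects repeated keys, including repeated nulls.
function violatesUnique(keys) {
  return new Set(keys).size !== keys.length;
}

const docs = [
  { name: 'Apple Pie', calories: 320 },
  { name: 'Mystery Stew' },
  { name: 'Secret Sauce' },
];
console.log(violatesUnique(indexKeys(docs, 'calories', false))); // true: two nulls
console.log(violatesUnique(indexKeys(docs, 'calories', true)));  // false: one key
```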
Geospatial Indexes
// Add longitude, latitude coordinates (longitude first)
{
     name: '10gen Palo Alto',
     loc: [ -122.158574, 37.449157 ]
}
// Index the coordinates
db.locations.ensureIndex( { loc : '2d' } )


// Query for locations 'near' a particular coordinate
db.locations.find({
     loc: { $near: [ -122.3, 37.4 ] }
})
TTL Collections
// Documents must have a BSON UTC Date field
{ 'submitted_date' : ISODate('2012-10-12T05:24:07.211Z'), … }


// Documents are removed after 'expireAfterSeconds' seconds
db.recipes.ensureIndex(
    { submitted_date: 1 },
    { expireAfterSeconds: 3600 }
)
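The expiry rule can be sketched as a small predicate in plain JavaScript. This is a model of the rule only: the server removes expired documents with a background task that runs periodically, so actual deletion can lag behind the moment a document becomes eligible. `isExpired` and the sample timestamps are hypothetical.

```javascript
// Decide whether a document is past its TTL, given the indexed date field.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // non-date values never expire
  return now.getTime() - value.getTime() >= expireAfterSeconds * 1000;
}

const recipe = { submitted_date: new Date('2012-10-12T05:24:07.211Z') };
const halfHourLater = new Date('2012-10-12T05:54:07.211Z');
const twoHoursLater = new Date('2012-10-12T07:24:07.211Z');
console.log(isExpired(recipe, 'submitted_date', 3600, halfHourLater)); // false
console.log(isExpired(recipe, 'submitted_date', 3600, twoHoursLater)); // true
```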
Limitations
• A collection cannot have more than 64 indexes.

• Index keys cannot be larger than 1024 bytes (1K).

• The name of an index, including the namespace, must be fewer
  than 128 characters.
• Queries can only use 1 index (with the exception of $or queries).

• Indexes have storage requirements, and impact the
  performance of writes.
• In-memory sorts (without an index) are limited to 32 MB of returned data.
Optimize Your Queries
Profiling Slow Ops
db.setProfilingLevel( n, slowms )   // slowms is in milliseconds; default 100


n=0 profiler off
n=1 record operations longer than slowms
n=2 record all operations


db.system.profile.find()




* The profile collection is a capped collection, and fixed in size
The Explain Plan (Pre Index)
db.recipes.find( { calories:
    { $lt : 40 } }
).explain( )
{
    "cursor" : "BasicCursor",
    "n" : 42,
    "nscannedObjects" : 12345,
    "nscanned" : 12345,
    ...
    "millis" : 356,
    ...
}
* explain() doesn't use cached plans; it re-evaluates the plans and resets the cache
The Explain Plan (Post Index)
db.recipes.find( { calories:
    { $lt : 40 } }
).explain( )
{
    "cursor" : "BtreeCursor calories_-1",
    "n" : 42,
    "nscannedObjects" : 42,
    "nscanned" : 42,
    ...
    "millis" : 0,
    ...
}
* explain() doesn't use cached plans; it re-evaluates the plans and resets the cache
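A quick way to read the two explain outputs is the ratio of documents returned (n) to items examined (nscanned), which should be as close to 1 as possible. The helper below is a hypothetical convenience in plain JavaScript, fed with the illustrative numbers from the slides rather than real measurements.

```javascript
// Ratio of returned documents (n) to items examined (nscanned); 1 is ideal.
function scanRatio(explainOutput) {
  if (explainOutput.nscanned === 0) return 1; // nothing examined, nothing wasted
  return explainOutput.n / explainOutput.nscanned;
}

const preIndex = { cursor: 'BasicCursor', n: 42, nscanned: 12345 };
const postIndex = { cursor: 'BtreeCursor calories_-1', n: 42, nscanned: 42 };
console.log(scanRatio(preIndex).toFixed(4)); // "0.0034"
console.log(scanRatio(postIndex));           // 1
```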
The Query Optimizer
• For each "type" of query, MongoDB
  periodically tries all useful indexes
• Aborts the rest as soon as one plan wins
• The winning plan is temporarily cached for
  each “type” of query
Manually Select Index to Use
// Tell the database what index to use
db.recipes.find(
  { calories: { $lt: 1000 } }
).hint({ _id: 1 })


// Tell the database to NOT use an index
db.recipes.find(
  { calories: { $lt: 1000 } }
).hint({ $natural: 1 })
Use Indexes to Sort Query
Results
// Given the following index
db.collection.ensureIndex({ a:1, b:1 , c:1, d:1 })

// The following query and sort operations can use the index
db.collection.find( ).sort({ a:1 })
db.collection.find( ).sort({ a:1, b:1 })

db.collection.find({ a:4 }).sort({ a:1, b:1 })
db.collection.find({ b:5 }).sort({ a:1, b:1 })
Indexes that won’t work for
sorting query results
// Given the following index
db.collection.ensureIndex({ a:1, b:1, c:1, d:1 })


// These cannot sort using the index
db.collection.find( ).sort({ b: 1 })
db.collection.find({ b: 5 }).sort({ b: 1 })
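The rule behind these examples is that every index field before the first sorted field must be an equality match in the query, and the sorted fields must then follow the index order. The sketch below encodes that rule in plain JavaScript as a simplified model: it ignores sort direction and range predicates, and `canSortWithIndex` is a hypothetical helper, not part of MongoDB.

```javascript
// indexFields: field names in index order, e.g. ['a', 'b', 'c', 'd']
// equalityFields: fields the query matches by equality
// sortFields: field names in sort order
function canSortWithIndex(indexFields, equalityFields, sortFields) {
  const start = indexFields.indexOf(sortFields[0]);
  if (start === -1) return false;
  // Every index field before the first sorted field must be an equality match.
  for (let i = 0; i < start; i++) {
    if (!equalityFields.includes(indexFields[i])) return false;
  }
  // The sort fields must then follow the index order without gaps.
  return sortFields.every((f, i) => indexFields[start + i] === f);
}

const idx = ['a', 'b', 'c', 'd'];
console.log(canSortWithIndex(idx, [], ['a', 'b']));    // true
console.log(canSortWithIndex(idx, ['b'], ['a', 'b'])); // true
console.log(canSortWithIndex(idx, [], ['b']));         // false
console.log(canSortWithIndex(idx, ['b'], ['b']));      // false
console.log(canSortWithIndex(idx, ['a'], ['b']));      // true
```

The last call shows how to make a sort on b work: add an equality match on a, for example find({ a: 4 }).sort({ b: 1 }).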
Index Covered Queries
// MongoDB can return data from just the index
db.recipes.ensureIndex({ main_ingredient: 1, name: 1 })

// Return only the name field
db.recipes.find(
   { main_ingredient: 'chicken' },
   { _id: 0, name: 1 }
)

// indexOnly will be true in the explain plan
db.recipes.find(
    { main_ingredient: 'chicken' },
    { _id: 0, name: 1 }
).explain()
{
    "indexOnly": true,
}
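Whether a query can be covered comes down to a simple check: every field in the query and every field returned must be part of the index, and _id must be excluded explicitly because it is returned by default. The plain JavaScript sketch below models that check; `isCovered` is a hypothetical helper and ignores other restrictions such as array fields.

```javascript
// A query is covered when every queried and every returned field is in the index.
function isCovered(indexFields, queryFields, projection) {
  const returned = Object.keys(projection).filter((f) => projection[f] === 1);
  // _id is returned by default unless the projection excludes it explicitly.
  if (projection._id !== 0 && !returned.includes('_id')) returned.push('_id');
  return [...queryFields, ...returned].every((f) => indexFields.includes(f));
}

const recipeIndex = ['main_ingredient', 'name'];
console.log(isCovered(recipeIndex, ['main_ingredient'], { _id: 0, name: 1 })); // true
console.log(isCovered(recipeIndex, ['main_ingredient'], { name: 1 }));         // false
```

The second call fails only because _id is still returned, which forces a lookup of the document itself.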
Absent or suboptimal
indexes are the most
common avoidable
MongoDB performance
problem.
Avoiding Common
Mistakes
Trying to Use Multiple
Indexes
// MongoDB can only use one index for a query
db.collection.ensureIndex({ a: 1 })
db.collection.ensureIndex({ b: 1 })


// Only one of the above indexes is used
db.collection.find({ a: 3, b: 4 })
Compound Key Mistakes
// Compound key indexes are very effective
db.collection.ensureIndex({ a: 1, b: 1, c: 1 })


// But only if the query is a prefix of the index


// This query can't effectively use the index
db.collection.find({ c: 2 })


// …but this query can
db.collection.find({ a: 3, b: 5 })
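The prefix rule can be expressed as counting how many leading index fields the query constrains. The sketch below does this in plain JavaScript; `usablePrefixLength` is a hypothetical helper that models the rule and is not how the server plans queries.

```javascript
// Count how many leading index fields the query constrains.
function usablePrefixLength(indexFields, queryFields) {
  let length = 0;
  for (const field of indexFields) {
    if (!queryFields.includes(field)) break;
    length++;
  }
  return length;
}

const compound = ['a', 'b', 'c'];
console.log(usablePrefixLength(compound, ['c']));      // 0
console.log(usablePrefixLength(compound, ['a', 'b'])); // 2
```

A result of 0 means the query skips the leading field, so the index cannot narrow the search effectively.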
Low Selectivity Indexes
db.collection.distinct('status')
[ 'new', 'processed' ]


db.collection.ensureIndex({ status: 1 })


// Low selectivity indexes provide little benefit
db.collection.find({ status: 'new' })


// Better
db.collection.ensureIndex({ status: 1, created_at: -1 })
db.collection.find(
  { status: 'new' }
).sort({ created_at: -1 })
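One way to judge selectivity is the fraction of the collection that an equality match returns: the closer to 1, the less an index on that field alone helps. The sketch below computes that fraction in plain JavaScript over made-up documents; `matchFraction`, the `owner` field and the data are all hypothetical.

```javascript
// Fraction of documents that an equality match on `field` would return.
function matchFraction(docs, field, value) {
  if (docs.length === 0) return 0;
  return docs.filter((d) => d[field] === value).length / docs.length;
}

// Hypothetical data: three of four documents share the same status.
const jobs = [
  { status: 'new', owner: 'ann' },
  { status: 'new', owner: 'bob' },
  { status: 'new', owner: 'cy' },
  { status: 'processed', owner: 'dee' },
];
console.log(matchFraction(jobs, 'status', 'new')); // 0.75
console.log(matchFraction(jobs, 'owner', 'bob'));  // 0.25
```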
Regular Expressions
db.users.ensureIndex({ username: 1 })


// Left anchored regex queries can use the index
db.users.find({ username: /^joe smith/ })


// But not generic regexes
db.users.find({username: /smith/ })


// Or case insensitive queries
db.users.find({ username: /Joe/i })
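The reason left-anchored patterns work is that they fix a literal prefix, which can be turned into a range over the sorted index keys. The sketch below extracts such a prefix in plain JavaScript. It is a deliberately conservative model, not MongoDB's own analysis: `indexablePrefix` is a hypothetical helper that gives up on alternation and on the i and m flags.

```javascript
// Return the literal prefix an index range scan could use, or null if none.
function indexablePrefix(regex) {
  // Case-insensitive or multiline patterns do not pin down a fixed prefix.
  if (regex.flags.includes('i') || regex.flags.includes('m')) return null;
  const source = regex.source;
  if (!source.startsWith('^')) return null; // not left-anchored
  if (source.includes('|')) return null;    // alternation: stay conservative
  let prefix = '';
  for (const ch of source.slice(1)) {
    if ('*?{'.includes(ch)) {
      prefix = prefix.slice(0, -1); // the previous character was optional
      break;
    }
    if ('.+[]()\\$}^'.includes(ch)) break; // any other metacharacter ends the prefix
    prefix += ch;
  }
  return prefix === '' ? null : prefix;
}

console.log(indexablePrefix(/^joe smith/)); // 'joe smith'
console.log(indexablePrefix(/smith/));      // null
console.log(indexablePrefix(/Joe/i));       // null
```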
Negation
// Indexes aren't helpful with negations
db.things.ensureIndex({ x: 1 })

// e.g. "not equal" queries
db.things.find({ x: { $ne: 3 } })

// …or "not in" queries
db.things.find({ x: { $nin: [2, 3, 4 ] } })

// …or the $not operator
db.people.find({ name: { $not: /^John Doe$/ } })
Choosing the right
indexes is one of the
most important things
you can do as a
MongoDB developer so
take the time to get your
indexes right!
#MongoDBDays




Thank you
Chad Tindel
Senior Solution Architect, 10gen


Editor's Notes

  1. When speaking: What are indexes and why do we need them? First part of this talk is conceptual. Second part is extremely detailed.
  2. Look at 7 documents
  3. Queries, inserts and deletes: O(log(n)) time
  4. MongoDB's indexes are B-Trees. Lookups (queries), inserts and deletes happen in O(log(n)) time. TODO: Add a page describing what a B-Tree is???
  5. So this is helpful, and can speed up queries by a tremendous amount
  6. So it’s imperative we understand them
  7. Tell a story about a customer problem caused by a missing index.
  8. Repeated calls to ensureIndex only result in one create message going to the server. The index is cached client side for some period of time (varies by driver).
  9. Indexes can be costly if you have too many, so...
  10. getIndexes returns an index document for each index in the collection. dropIndex requires the spec used to create the index initially. reIndex drops *all* indexes (including the _id index) and rebuilds them.
  11. Caveats: Still a resource-intensive operation. Index build is slower. The mongo shell session or app will block while the index builds. Indexes are still built in the foreground on secondaries. Kristine to provide replica set image.
  12. unique applies a uniqueness constraint on the indexed values. dropDups will force the server to create a unique index by only keeping the first document found in natural order with a value and dropping all other documents with that value. dropDups will likely result in data loss!!! TODO: Maybe add a red exclamation point for dropDups.
  13. MongoDB doesn't enforce a schema – documents are not required to have the same fields. Sparse indexes only contain entries for documents that have the indexed field. Without sparse, documents without field 'a' have a null entry in the index for that field. With sparse, a unique constraint can be applied to a field not shared by all documents. Otherwise multiple 'null' values violate the unique constraint. XXX: Is there a visual that makes sense here?
  14. A '2d' index is a geohash on top of the B-tree. It allows you to search for documents 'near' a latitude/longitude position. Bounds queries are also possible using $within. TODO: Google maps image, or something similar. Kristine to provide.
  15. Index must be on a BSON date field. Documents are removed after expireAfterSeconds seconds. Reaper thread runs every 60 seconds. TODO: Hourglass image, or something similar. Kristine to provide.
  16. Indexes are a really powerful feature of MongoDB, however there are some limitations. Understanding these limitations is an important part of using MongoDB correctly. Queries can only use one index, with the exception of $or queries. If the index key exceeds 1K, documents are silently dropped/not included.
  17. Changing slowms also affects what queries are logged to the MongoDB log file.
  18. cursor – the type of cursor used. BasicCursor means no index was used. TODO: Use a real example here instead of made up numbers… n – the number of documents that match the query. nscannedObjects – the number of documents that had to be scanned. nscanned – the number of items (index entries or documents) examined. millis – how long the query took. Ratio of n to nscanned should be as close to 1 as possible.
  19. cursor – the type of cursor used. BasicCursor means no index was used. n – the number of documents that match the query. nscannedObjects – the number of documents that had to be scanned. nscanned – the number of items (index entries or documents) examined. millis – how long the query took. Ratio of n to nscanned should be as close to 1 as possible.
  20. Winning plan is reevaluated after 1000 write operations (insert, update, remove, etc.). TODO: Replace much of this with an animation? Kristine to provide.
  21. Tells MongoDB exactly what index to use.
  22. MongoDB sorts results based on the field order in the index. For queries that include a sort that uses a compound key index, ensure that all fields before the first sorted field are equality matches. TODO: Better explanation
  23. MongoDB sorts results based on the field order in the index. For queries that include a sort that uses a compound key index, ensure that all fields before the first sorted field are equality matches. TODO: Better explanation
  24. TODO: Cookbook image here? Rework to go along with the cookbook example?
  25. Tell a story about a customer problem caused by a suboptimal index. TODO: Change background color?
  26. Better to use a compound index on the low selectivity field and some other more selective field.