Avoiding Query Pitfalls
Chris Harris
Technical Services Engineer, MongoDB
Roadmap
Motivation
Who am I?
Three Items to be aware of:
Blocking Stages
Using the $or operator
Case-insensitivity
The Power of Query Optimization
Query tuning results in:
Improved performance
Reduced resource utilization
This may lead to:
Improved stability and predictability
A smaller hardware footprint
Not uncommon to observe efficiency improvements greater than 99%
About Me
Technical Services Engineer (Support)
2.5 year tenure
Member of the Technical Experts program
Focus: Queries and Indexing
Previously: Data Warehouse workload optimization
Meet Asya
DBA at Acme Games, Inc.
MongoDB Champion
Meet Stakeholders
Others at Acme, Inc.
Developers
Leadership
RDBMS Historically
Acme Games Introduces...
ShortFite!
Brand new Battle Royale game
Launching July 1st
Stakeholder Concerns
Game nearly complete
Developers have learned a lot from Asya
Indexes support the efficient execution of queries in MongoDB
App being stress tested
Concerns over current performance
Stakeholder Concern #1
Developers created index
db.games.createIndex({ gamerTag: 1 })
This query takes several seconds to execute:
db.games.find( { gamerTag: "Ace" } ).sort({score:-1})
Adding the index on score does not help!
db.games.createIndex({ score: -1 })
“Clearly MongoDB is not webscale!”
Blocking Operations
Blocking Operation
Formally:
“An operation which must process all input before it can begin to produce any output.”
Opposite of the often desirable “fully pipelined” plan, which can stream results back as soon as they are found.
Commonly observed when a sort is added to a query
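The definition above can be made concrete with a small sketch in plain JavaScript. It is only an illustration, not how the server implements sorting: `blockingSort`, `streamingPassThrough` and `countingSource` are hypothetical helpers that count how many input documents are pulled before the first result comes out.

```javascript
// Blocking: every input document must be consumed before the first result is produced.
function* blockingSort(docs, compare) {
  const buffer = [];
  for (const doc of docs) buffer.push(doc); // consume ALL input first
  buffer.sort(compare);
  yield* buffer;
}

// Fully pipelined: the input already arrives in the requested order
// (as it would from a suitable index), so each document is passed on immediately.
function* streamingPassThrough(docs) {
  for (const doc of docs) yield doc;
}

// Hypothetical helper: wraps an array and counts how many items were pulled from it.
function countingSource(items) {
  const state = { pulled: 0 };
  function* gen() {
    for (const item of items) {
      state.pulled++;
      yield item;
    }
  }
  return { state, iterator: gen() };
}

const byScoreDesc = (a, b) => b.score - a.score;
const games = [{ score: 500 }, { score: 9500 }, { score: 7000 }];

// Ask each plan for just its first result and see how much input it needed.
const blocked = countingSource(games);
const firstBlocked = blockingSort(blocked.iterator, byScoreDesc).next().value;

const presorted = countingSource([...games].sort(byScoreDesc));
const firstStreamed = streamingPassThrough(presorted.iterator).next().value;

console.log(firstBlocked.score, "after pulling", blocked.state.pulled, "documents");
console.log(firstStreamed.score, "after pulling", presorted.state.pulled, "document");
```

Both plans return the same first document, but the blocking one has to read the entire input before it can answer.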
Sorting with blocking
Sorting without blocking
Blocking Stages
$sort
In aggregation and find
$group
$bucket
$count
$facet
Are there any other blocking operations?
Working with blocking stages
For sorting:
Add a supporting index
Worth the overhead in almost all circumstances
For other stages:
Do you need the blocking stage?
Offload to secondary member
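One way to check whether a query still needs a blocking sort is to look for a SORT stage in its explain output. The sketch below assumes the classic query engine's plan shape (nested `stage` / `inputStage` / `inputStages` fields); the two plan objects are hand-written, abridged mock-ups for illustration, not real server output, and `hasBlockingSort` is a hypothetical helper.

```javascript
// Returns true if any stage in an explain() plan tree is an in-memory SORT.
function hasBlockingSort(stage) {
  if (!stage) return false;
  if (stage.stage === "SORT") return true;
  const children = [];
  if (stage.inputStage) children.push(stage.inputStage);
  if (Array.isArray(stage.inputStages)) children.push(...stage.inputStages);
  return children.some((child) => hasBlockingSort(child));
}

// Mock plan (assumption): only { gamerTag: 1 } exists, so the sort happens in memory.
const singleFieldIndexPlan = {
  stage: "SORT",
  inputStage: {
    stage: "FETCH",
    inputStage: { stage: "IXSCAN", indexName: "gamerTag_1" },
  },
};

// Mock plan (assumption): a supporting index returns documents already in order.
const compoundIndexPlan = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "gamerTag_1_score_-1" },
};

console.log(hasBlockingSort(singleFieldIndexPlan)); // true
console.log(hasBlockingSort(compoundIndexPlan));    // false

// In the shell, the plan to inspect would come from something like:
//   db.games.find({ gamerTag: "Ace" }).sort({ score: -1 }).explain().queryPlanner.winningPlan
```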
Stakeholder Concern #1
Performance of
db.games.find( { gamerTag: "Ace" } ).sort({score:-1})
“Clearly MongoDB is not webscale!”
db.games.createIndex({ gamerTag: 1, score: -1 })
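Why the compound index removes the sort can be sketched with a plain sorted array standing in for the index. This is a toy model under that assumption: `compareCompound` is a hypothetical comparator mirroring the key order { gamerTag: 1, score: -1 }.

```javascript
const games = [
  { gamerTag: "Bob", score: 500 },
  { gamerTag: "Ace", score: 500 },
  { gamerTag: "Bob", score: 9500 },
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "Ace", score: 7000 },
];

// Order entries the way a { gamerTag: 1, score: -1 } index would:
// gamerTag ascending, then score descending within each gamerTag.
function compareCompound(a, b) {
  if (a.gamerTag !== b.gamerTag) return a.gamerTag < b.gamerTag ? -1 : 1;
  return b.score - a.score;
}

const indexEntries = [...games].sort(compareCompound);

// All "Ace" entries are contiguous and already ordered by score descending,
// so reading that range needs no additional sort step.
const aceInIndexOrder = indexEntries
  .filter((entry) => entry.gamerTag === "Ace")
  .map((entry) => entry.score);

console.log(aceInIndexOrder); // [ 9500, 7000, 500 ]
```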
"That’ll work great!”
Stakeholder Concern #2
The $and version of a query returns quickly:
db.games.find({
    $and : [
        { gamerTag: "Ace" },
        { score: {$gt: 9000} }
    ]
})
But the $or version is slow:
db.games.find({
    $or : [
        { gamerTag: "Ace" },
        { score: {$gt: 9000} }
    ]
})
We just created an index with both those fields… Can it be used?
$or
$and example
Query on games:
db.games.find({
    $and : [
        { gamerTag: "Ace" },
        { score: {$gt: 9000} }
    ]
})
Matching games:
{ gamerTag: "Ace", score: 9500 }
Non-matching games:
{ gamerTag: "Ace", score: 500 },
{ gamerTag: "Bob", score: 9500 },
{ gamerTag: "Bob", score: 500 }
Groups of documents
gamerTag: "Ace" and score: {$gt: 9000}
$and Venn Diagram (logical)
Only the intersection of the two groups matches the $and query: { gamerTag: "Ace", score: 9500 }
$and Index Visualization
Index: { gamerTag: 1, score: -1 }
Index tree: Ace (500, 9500), Bob (500, 9500)
db.games.find({
    $and : [
        { gamerTag: "Ace" },
        { score: {$gt: 9000} }
    ]
})
"indexBounds" : {
    "gamerTag" : [
        "[\"Ace\", \"Ace\"]"
    ],
    "score" : [
        "[inf.0, 9000.0)"
    ]
}
Within these bounds only the Ace / 9500 key is scanned.
$or example
Query on games:
db.games.find({
    $or : [
        { gamerTag: "Ace" },
        { score: {$gt: 9000} }
    ]
})
Matching games:
{ gamerTag: "Ace", score: 9500 },
{ gamerTag: "Ace", score: 500 },
{ gamerTag: "Bob", score: 9500 }
Non-matching games:
{ gamerTag: "Bob", score: 500 }
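The difference between the two examples can be reproduced with plain array filters over the same four documents. This is only a logical sketch of which documents match, with hypothetical predicate names `isAce` and `highScore`; it says nothing about how the server executes either query.

```javascript
const games = [
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "Ace", score: 500 },
  { gamerTag: "Bob", score: 9500 },
  { gamerTag: "Bob", score: 500 },
];

const isAce = (game) => game.gamerTag === "Ace";   // { gamerTag: "Ace" }
const highScore = (game) => game.score > 9000;     // { score: {$gt: 9000} }

// $and: a document must satisfy both predicates (intersection).
const andMatches = games.filter((game) => isAce(game) && highScore(game));

// $or: a document must satisfy at least one predicate (union).
const orMatches = games.filter((game) => isAce(game) || highScore(game));

console.log(andMatches.length); // 1
console.log(orMatches.length);  // 3
```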
$or Venn Diagram (logical)
Every document in either group matches the $or query (the union); only { gamerTag: "Bob", score: 500 } falls outside both.
$or (single) Index visualization
Index: { gamerTag: 1, score: -1 }
db.games.find({
    $or : [
        { gamerTag: "Ace" },
        { score: {$gt: 9000} }
    ]
})
Expected Index Bounds:
"indexBounds" : {
    "gamerTag" : [
        "[\"Ace\", \"Ace\"]"
    ],
    "score" : [
        "[inf.0, 9000.0)"
    ]
}
Actual (Hinted) Index Bounds:
"indexBounds" : {
    "gamerTag" : [
        "[MinKey, MaxKey]"
    ],
    "score" : [
        "[MaxKey, MinKey]"
    ]
}
The expected bounds would only cover documents that satisfy both predicates, so answering the $or from this single index means scanning all of it.
So is there anything we can do to improve the performance of this query?
Recommendations
Use multiple indexes!
db.games.createIndex({gamerTag: 1})
db.games.createIndex({score: 1})
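The idea behind using one index per $or branch can be sketched with two toy single-field indexes whose results are merged and de-duplicated. Everything here is an assumption for illustration: `buildIndex` and `scan` are hypothetical helpers, the indexes are plain sorted arrays, and a real B-tree would seek straight to the bounds rather than filter.

```javascript
const games = [
  { _id: 1, gamerTag: "Ace", score: 9500 },
  { _id: 2, gamerTag: "Ace", score: 500 },
  { _id: 3, gamerTag: "Bob", score: 9500 },
  { _id: 4, gamerTag: "Bob", score: 500 },
];

// Toy single-field "index": entries sorted by key, each pointing at a document _id.
function buildIndex(docs, field) {
  return docs
    .map((doc) => ({ key: doc[field], id: doc._id }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Collect the _ids of entries inside the bounds, counting only those keys as examined.
function scan(index, inBounds, stats) {
  const ids = [];
  for (const entry of index) {
    if (inBounds(entry.key)) {
      stats.keysExamined++;
      ids.push(entry.id);
    }
  }
  return ids;
}

const tagIndex = buildIndex(games, "gamerTag");   // stands in for { gamerTag: 1 }
const scoreIndex = buildIndex(games, "score");    // stands in for { score: 1 }

const stats = { keysExamined: 0 };
const fromTag = scan(tagIndex, (key) => key === "Ace", stats);   // bounds ["Ace", "Ace"]
const fromScore = scan(scoreIndex, (key) => key > 9000, stats);  // bounds (9000, inf]

// Union of both branches; a document reached through both indexes is returned once.
const resultIds = [...new Set([...fromTag, ...fromScore])].sort((a, b) => a - b);

console.log(resultIds);          // [ 1, 2, 3 ]
console.log(stats.keysExamined); // 4
```

Document 1 is found by both branches, which is why four keys are examined for three results.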
$or (multiple) Index visualization
db.games.find({
    $or : [
        { gamerTag: "Ace" },
        { score: {$gt: 9000} }
    ]
})
Index { gamerTag: 1 } with keys Ace, Bob:
"indexBounds" : {
    "gamerTag" : [
        "[\"Ace\", \"Ace\"]"
    ]
}
Index { score: 1 } with keys 500, 9500:
"indexBounds" : {
    "score" : [
        "(9000.0, inf.0]"
    ]
}
Each branch of the $or scans its own index with tight bounds.
Recommendations
Use multiple indexes!
We already have the {gamerTag:1, score:-1} index, do we need both of these new ones?
Works with sorting
Generate a SORT_MERGE plan
Stakeholder Concern #2
db.games.find({
    $or : [ { gamerTag: "Ace" }, { score: {$gt: 9000} } ]
})
Having the right index is critical
“Super!!”
Stakeholder Concern #3
“Wait wait wait, we can’t even FIND the gamers!”
A basic search on gamerTag takes several seconds already:
db.games.find({gamerTag: /^Ace$/i})
“This query is SLOWER with the index than it is without it!”
Case Insensitive
Case Sensitive
db.games.find({
    gamerTag: /^Ace$/
})
//equivalent to
db.games.find({
    gamerTag: "Ace"
})
Matching games:
{ gamerTag: "Ace", score: 9500 }
Non-matching games:
{ gamerTag: "ACE", score: 500 },
{ gamerTag: "aCe", score: 9500 },
{ gamerTag: "ace", score: 0 },
{ gamerTag: "Bob", score: 500 },
{ gamerTag: "acxyz", score: 9500 },
{ gamerTag: "Ace mdb", score: 9500 }
Index keys: ace, aCe, acxyz, Ace, Ace mdb, ACE, Bob
"indexBounds" : {
    "gamerTag" : [
        "[\"Ace\", \"Acf\")",
        "[/^Ace$/, /^Ace$/]"
    ]
}
Case Insensitive
db.games.find({
    gamerTag: /^Ace$/i
})
//equivalent to
db.games.find({
    gamerTag: {
        $regex: "^Ace$",
        $options: "i"
    }
})
//equivalent to
db.games.find({ gamerTag: "Ace" })
    .collation({ locale: 'en', strength: 2 })
Would a $text search be the same as well?
Matching games:
{ gamerTag: "Ace", score: 9500 },
{ gamerTag: "ACE", score: 500 },
{ gamerTag: "aCe", score: 9500 },
{ gamerTag: "ace", score: 0 }
Non-matching games:
{ gamerTag: "Bob", score: 500 },
{ gamerTag: "acxyz", score: 9500 },
{ gamerTag: "Ace mdb", score: 9500 }
Index keys: ace, aCe, acxyz, Ace, Ace mdb, ACE, Bob
"indexBounds" : {
    "gamerTag" : [
        "[\"\", {})",
        "[/^Ace$/i, /^Ace$/i]"
    ]
}
With the case-insensitive regex the bounds span every string key, so the whole index is examined.
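That the case-insensitive regex and the collation-based query select the same gamer tags can be checked outside the database. The sketch below uses JavaScript's `Intl.Collator` with `sensitivity: "accent"` as a rough stand-in for a collation of strength 2 (case ignored, base letters and diacritics still compared); treating the two as analogous is an assumption made for illustration.

```javascript
const tags = ["Ace", "ACE", "aCe", "ace", "Bob", "acxyz", "Ace mdb"];

// Case-insensitive, anchored regex, as in { gamerTag: /^Ace$/i }.
const regexMatches = tags.filter((tag) => /^Ace$/i.test(tag));

// Equality under a case-insensitive comparison, roughly what
// .collation({ locale: 'en', strength: 2 }) asks the server for.
const collator = new Intl.Collator("en", { sensitivity: "accent" });
const collationMatches = tags.filter((tag) => collator.compare(tag, "Ace") === 0);

console.log(regexMatches);     // [ 'Ace', 'ACE', 'aCe', 'ace' ]
console.log(collationMatches); // [ 'Ace', 'ACE', 'aCe', 'ace' ]
```

The results agree, but only the equality form can be answered with tight bounds on an index that was built with the same collation.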
Recommendations
Case insensitive index!
Collations available since 3.4
db.games.createIndex( { gamerTag: 1 },
    { collation: { locale: 'en', strength: 2 } } )
> db.games.find( { gamerTag: "Ace" } ).collation( { locale: 'en', strength: 2 } )
{ "_id" : ObjectId("5b29dbee6c7d4f531bf73b5d"), "gamerTag" : "Ace", "score" : 9500 }
{ "_id" : ObjectId("5b29dbee6c7d4f531bf73b5e"), "gamerTag" : "ACE", "score" : 500 }
{ "_id" : ObjectId("5b29dbee6c7d4f531bf73b5f"), "gamerTag" : "aCe", "score" : 9500 }
{ "_id" : ObjectId("5b29dbee6c7d4f531bf73b60"), "gamerTag" : "ace", "score" : 0 }
Store a transformed (e.g. toLower()) copy of the string
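The transformed-copy alternative could look roughly like this. It is a sketch under assumptions: the field name `gamerTagLower` and the helpers `withLowercaseTag` and `findByTag` are hypothetical, and the array filter stands in for an ordinary equality query on the extra field.

```javascript
// Add a lowercased copy of gamerTag when the document is written.
// (gamerTagLower is a hypothetical field name, not from the talk.)
function withLowercaseTag(doc) {
  return { ...doc, gamerTagLower: doc.gamerTag.toLowerCase() };
}

// Look up by lowercasing the search term the same way, then comparing exactly.
function findByTag(docs, tag) {
  const needle = tag.toLowerCase();
  return docs.filter((doc) => doc.gamerTagLower === needle);
}

const stored = [
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "ACE", score: 500 },
  { gamerTag: "aCe", score: 9500 },
  { gamerTag: "ace", score: 0 },
  { gamerTag: "Bob", score: 500 },
  { gamerTag: "acxyz", score: 9500 },
  { gamerTag: "Ace mdb", score: 9500 },
].map(withLowercaseTag);

const found = findByTag(stored, "Ace").map((doc) => doc.gamerTag);
console.log(found); // [ 'Ace', 'ACE', 'aCe', 'ace' ]

// In the shell this would correspond to a plain index and an exact match, e.g.:
//   db.games.createIndex({ gamerTagLower: 1 })
//   db.games.find({ gamerTagLower: "ace" })
```

The trade-off is an extra field to keep in sync on every write, in exchange for a simple equality match on a regular index.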
Stakeholder Concern #3
db.games.find({gamerTag: "Ace"})
    .collation({locale:'en', strength:2})
“Wow, MongoDB can do anything!!!!1!”
Summary
Work Smarter Not Harder
Understand the business logic
Index appropriately
Is it the right index to support the query?
Be aware of:
Blocking Stages
Usage of $or
Case sensitivity
Leverage the Performance Advisor
Countdown to ShortFite
Powered by an optimized MongoDB environment, ShortFite is sure to be a hit!
Queries?
Results
Before/after metrics comparison

MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDBMongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Último (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 

Tips and Tricks for Avoiding Common Query Pitfalls

  • 4. Roadmap: Motivation. Who am I? Three items to be aware of: blocking stages, using the $or operator, case-insensitivity.
  • 5. The Power of Query Optimization. Query tuning results in improved performance and reduced resource utilization. This may lead to improved stability and predictability, and a smaller hardware footprint. It is not uncommon to observe efficiency improvements greater than 99%.
  • 6. About Me: Technical Services Engineer (Support); 2.5-year tenure; member of the Technical Experts program. Focus: queries and indexing. Previously: data warehouse workload optimization.
  • 8. Meet Asya: DBA at Acme Game, Inc.; MongoDB Champion. Meet the stakeholders: others at Acme, Inc. (developers and leadership); RDBMS historically.
  • 9. Acme Games introduces... ShortFite! A brand new Battle Royale game, launching July 1st.
  • 15. Stakeholder Concerns: Game nearly complete. Developers have learned a lot from Asya: “Indexes support the efficient execution of queries in MongoDB.” App being stress tested. Concerns over current performance.
  • 17. Stakeholder Concern #1: Developers created an index: db.games.createIndex({ gamerTag: 1 }). This query takes several seconds to execute: db.games.find({ gamerTag: "Ace" }).sort({ score: -1 }). Adding an index on score does not help: db.games.createIndex({ score: -1 }). “Clearly MongoDB is not webscale!”
  • 19. Blocking Operation. Formally: “An operation which must process all input before it can begin to produce any output.” This is the opposite of the often desirable “fully pipelined” plan, which can stream results back as soon as they are found. Commonly observed when a sort is added to a query.
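The definition on slide 19 can be illustrated outside the database. The following is a minimal sketch in plain JavaScript, not MongoDB code: `scan`, `blockingSort`, and `streamingFilter` are hypothetical helpers, and the array stands in for a collection. It counts how many documents each stage must read before it can return its first result.

```javascript
// Hypothetical in-memory stand-in for a collection (not a MongoDB API).
const games = [
  { gamerTag: "Ace", score: 500 },
  { gamerTag: "Bob", score: 9500 },
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "Bob", score: 500 },
];

// Yields documents one at a time and counts how many were read.
function* scan(docs, counter) {
  for (const doc of docs) {
    counter.read += 1;
    yield doc;
  }
}

// Blocking: the sort buffers ALL input before it can emit anything.
function* blockingSort(input, compare) {
  const buffer = [...input];
  buffer.sort(compare);
  yield* buffer;
}

// Fully pipelined: each match is emitted as soon as it is seen.
function* streamingFilter(input, predicate) {
  for (const doc of input) {
    if (predicate(doc)) yield doc;
  }
}

const sortCounter = { read: 0 };
const firstSorted = blockingSort(
  scan(games, sortCounter),
  (a, b) => b.score - a.score
).next().value;

const filterCounter = { read: 0 };
const firstFiltered = streamingFilter(
  scan(games, filterCounter),
  (doc) => doc.gamerTag === "Ace"
).next().value;

console.log("read before first sorted result:", sortCounter.read);
console.log("read before first filtered result:", filterCounter.read);
```

In this sketch the sort reads every document before producing its first result, while the filter stops after the first match.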
  • 48. Blocking Stages: $sort (in aggregation and find), $group, $bucket, $count, $facet. Are there any other blocking operations?
  • 49. Working with blocking stages. For sorting: add a supporting index; it is worth the overhead in almost all circumstances. For other stages: do you need the blocking stage? Consider offloading to a secondary member.
  • 52. Stakeholder Concern #1: Performance of db.games.find({ gamerTag: "Ace" }).sort({ score: -1 }). “Clearly MongoDB is not webscale!” Supporting index: db.games.createIndex({ gamerTag: 1, score: -1 }). “That’ll work great!”
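Why the compound index removes the blocking sort can be sketched in plain JavaScript. This is an illustration under assumptions, not MongoDB internals: `compoundIndex` is a sorted array standing in for the { gamerTag: 1, score: -1 } index, and `findByTagOrderedByScoreDesc` is a hypothetical helper. Because entries with the same gamerTag are stored next to each other in descending score order, walking them already yields the sorted result.

```javascript
// Hypothetical in-memory stand-in for the games collection.
const games = [
  { gamerTag: "Ace", score: 500 },
  { gamerTag: "Bob", score: 9500 },
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "Bob", score: 500 },
  { gamerTag: "Ace", score: 7000 },
];

// Index entries ordered like { gamerTag: 1, score: -1 }.
const compoundIndex = games
  .map((doc, id) => ({ gamerTag: doc.gamerTag, score: doc.score, id }))
  .sort((a, b) => {
    if (a.gamerTag !== b.gamerTag) return a.gamerTag < b.gamerTag ? -1 : 1;
    return b.score - a.score;
  });

// Streams matching documents in index order; no sort step is needed.
function* findByTagOrderedByScoreDesc(tag) {
  // Linear seek for brevity; a B-tree would jump straight to the first key.
  let i = compoundIndex.findIndex((entry) => entry.gamerTag === tag);
  if (i === -1) return;
  for (; i < compoundIndex.length && compoundIndex[i].gamerTag === tag; i += 1) {
    yield games[compoundIndex[i].id];
  }
}

const aceScores = [...findByTagOrderedByScoreDesc("Ace")].map((doc) => doc.score);
console.log(aceScores);
```

The generator can hand back the highest-scoring Ace game after touching a single index entry, which is the "fully pipelined" behaviour described on slide 19.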
  • 54. Stakeholder Concern #2: The $and version of a query returns quickly: db.games.find({ $and: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }). But the $or version is slow: db.games.find({ $or: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }). We just created an index with both those fields… Can it be used?
  • 55. $or
  • 56. $and example. Query on games: db.games.find({ $and: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }). Matching games: { gamerTag: "Ace", score: 9500 }. Non-matching games: { gamerTag: "Ace", score: 500 }, { gamerTag: "Bob", score: 9500 }, { gamerTag: "Bob", score: 500 }.
  • 57. Groups of documents. Two predicates, gamerTag: "Ace" and score: {$gt: 9000}, over four documents: { gamerTag: "Ace", score: 9500 }, { gamerTag: "Ace", score: 500 }, { gamerTag: "Bob", score: 9500 }, { gamerTag: "Bob", score: 500 }.
  • 64. $and Venn Diagram (logical) for db.games.find({ $and: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }). [Figure: two overlapping circles, gamerTag: "Ace" and score: {$gt: 9000}; only the intersection, { gamerTag: "Ace", score: 9500 }, matches.]
  • 69. $and Index Visualization for db.games.find({ $and: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }) on the index {gamerTag: 1, score: -1}. [Figure: index entries for Ace and Bob, each with scores 500 and 9500, with the scanned range highlighted.] "indexBounds" : { "gamerTag" : [ "[\"Ace\", \"Ace\"]" ], "score" : [ "[inf.0, 9000.0)" ] }
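The tight bounds shown for the $and query can be mimicked with a small plain-JavaScript sketch. It is an illustration only: `compoundIndex` and `andScan` are hypothetical names, the 96 "Sue" documents are filler data added so the effect is visible, and `keysExamined` here is this sketch's own counter rather than the value MongoDB would report.

```javascript
// Hypothetical data: the four games from the slides plus filler documents.
const games = [
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "Ace", score: 500 },
  { gamerTag: "Bob", score: 9500 },
  { gamerTag: "Bob", score: 500 },
];
for (let i = 0; i < 96; i += 1) games.push({ gamerTag: "Sue", score: i });

// Index entries ordered like { gamerTag: 1, score: -1 }.
const compoundIndex = games
  .map((doc, id) => ({ gamerTag: doc.gamerTag, score: doc.score, id }))
  .sort((a, b) => {
    if (a.gamerTag !== b.gamerTag) return a.gamerTag < b.gamerTag ? -1 : 1;
    return b.score - a.score;
  });

// gamerTag == tag AND score > minScore: seek to the first key for the tag,
// then stop as soon as an entry falls outside the bounds.
function andScan(tag, minScore) {
  let keysExamined = 0;
  const matches = [];
  let i = compoundIndex.findIndex((entry) => entry.gamerTag === tag);
  for (; i !== -1 && i < compoundIndex.length; i += 1) {
    keysExamined += 1;
    const entry = compoundIndex[i];
    if (entry.gamerTag !== tag || entry.score <= minScore) break;
    matches.push(games[entry.id]);
  }
  return { keysExamined, matches };
}

const andResult = andScan("Ace", 9000);
console.log(andResult.keysExamined, andResult.matches.length);
```

With 100 entries in the index, the scan in this sketch touches two keys: the one match and the first entry past the bounds.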
  • 70. $or example. Query on games: db.games.find({ $or: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }). Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "Ace", score: 500 }, { gamerTag: "Bob", score: 9500 }. Non-matching games: { gamerTag: "Bob", score: 500 }.
  • 80. $or Venn Diagram (logical) for db.games.find({ $or: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }). [Figure: the same two circles, gamerTag: "Ace" and score: {$gt: 9000}; the union matches { gamerTag: "Ace", score: 9500 }, { gamerTag: "Ace", score: 500 }, and { gamerTag: "Bob", score: 9500 }; only { gamerTag: "Bob", score: 500 } falls outside.]
  • 88. $or (single) Index visualization for db.games.find({ $or: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }) on the index {gamerTag: 1, score: -1}. [Figure: index entries for Ace and Bob, each with scores 500 and 9500.] Expected index bounds: "indexBounds" : { "gamerTag" : [ "[\"Ace\", \"Ace\"]" ], "score" : [ "[inf.0, 9000]" ] }. Actual (hinted) index bounds: "indexBounds" : { "gamerTag" : [ "[MinKey, MaxKey]" ], "score" : [ "[MaxKey, MinKey]" ] }. So is there anything we can do to improve the performance of this query?
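The [MinKey, MaxKey] bounds follow from the logic of $or: a document with any gamerTag may still match through its score, so no range of the compound index can be skipped. A minimal plain-JavaScript sketch of that full scan follows; `compoundIndex` and `orScanSingleIndex` are hypothetical names, the "Sue" documents are filler, and the counter is this sketch's own.

```javascript
// Hypothetical data: the four games from the slides plus filler documents.
const games = [
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "Ace", score: 500 },
  { gamerTag: "Bob", score: 9500 },
  { gamerTag: "Bob", score: 500 },
];
for (let i = 0; i < 96; i += 1) games.push({ gamerTag: "Sue", score: i });

// Index entries ordered like { gamerTag: 1, score: -1 }.
const compoundIndex = games
  .map((doc, id) => ({ gamerTag: doc.gamerTag, score: doc.score, id }))
  .sort((a, b) => {
    if (a.gamerTag !== b.gamerTag) return a.gamerTag < b.gamerTag ? -1 : 1;
    return b.score - a.score;
  });

// gamerTag == tag OR score > minScore against ONE compound index:
// every entry has to be inspected, because either field alone can match.
function orScanSingleIndex(tag, minScore) {
  let keysExamined = 0;
  const matches = [];
  for (const entry of compoundIndex) {
    keysExamined += 1;
    if (entry.gamerTag === tag || entry.score > minScore) {
      matches.push(games[entry.id]);
    }
  }
  return { keysExamined, matches };
}

const orSingle = orScanSingleIndex("Ace", 9000);
console.log(orSingle.keysExamined, orSingle.matches.length);
```

In this sketch all 100 index entries are examined to return three documents.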
  • 103. $or (multiple) Index visualization for db.games.find({ $or: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }). [Figure: two separate indexes, {gamerTag: 1} with keys Ace and Bob, and {score: 1} with keys 500 and 9500, each scanned over its own range.] Bounds on {gamerTag: 1}: "indexBounds" : { "gamerTag" : [ "[\"Ace\", \"Ace\"]" ] }. Bounds on {score: 1}: "indexBounds" : { "score" : [ "(9000.0, inf.0]" ] }
  • 107. Recommendations: Use multiple indexes! db.games.createIndex({ gamerTag: 1 }) and db.games.createIndex({ score: 1 }). We already have the { gamerTag: 1, score: -1 } index; do we need both of these new ones? Multiple indexes also work with sorting, by generating a SORT_MERGE plan.
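The multiple-index approach can be sketched as two bounded scans whose results are unioned and de-duplicated. This plain-JavaScript illustration makes several assumptions: `tagIndex`, `scoreIndex`, and `orScanTwoIndexes` are hypothetical names, the "Sue" documents are filler, and de-duplicating by array position stands in for de-duplicating by record identifier.

```javascript
// Hypothetical data: the four games from the slides plus filler documents.
const games = [
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "Ace", score: 500 },
  { gamerTag: "Bob", score: 9500 },
  { gamerTag: "Bob", score: 500 },
];
for (let i = 0; i < 96; i += 1) games.push({ gamerTag: "Sue", score: i });

// Two single-field indexes, ordered like { gamerTag: 1 } and { score: 1 }.
const tagIndex = games
  .map((doc, id) => ({ key: doc.gamerTag, id }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
const scoreIndex = games
  .map((doc, id) => ({ key: doc.score, id }))
  .sort((a, b) => a.key - b.key);

function orScanTwoIndexes(tag, minScore) {
  let keysExamined = 0;
  const seen = new Set(); // a document matching both branches is kept once

  // Branch 1: gamerTag bounds [tag, tag].
  let i = tagIndex.findIndex((entry) => entry.key === tag);
  for (; i !== -1 && i < tagIndex.length && tagIndex[i].key === tag; i += 1) {
    keysExamined += 1;
    seen.add(tagIndex[i].id);
  }

  // Branch 2: score bounds (minScore, inf].
  let j = scoreIndex.findIndex((entry) => entry.key > minScore);
  for (; j !== -1 && j < scoreIndex.length; j += 1) {
    keysExamined += 1;
    seen.add(scoreIndex[j].id);
  }

  return { keysExamined, matches: [...seen].map((id) => games[id]) };
}

const orMulti = orScanTwoIndexes("Ace", 9000);
console.log(orMulti.keysExamined, orMulti.matches.length);
```

Here each branch reads only the keys inside its own bounds, so four keys are examined in total, and the Ace/9500 game, which satisfies both branches, appears once in the result.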
  • 110. Stakeholder Concern #2: db.games.find({ $or: [ { gamerTag: "Ace" }, { score: { $gt: 9000 } } ] }). Having the right index is critical. “Super!!”
  • 111. Stakeholder Concern #3: “Wait wait wait, we can’t even FIND the gamers!” A basic search on gamerTag takes several seconds already: db.games.find({ gamerTag: /^Ace$/i }). “This query is SLOWER with the index than it is without it!”
  • 114. Case Sensitive: db.games.find({ gamerTag: /^Ace$/ }) // equivalent to db.games.find({ gamerTag: "Ace" }). Matching games: { gamerTag: "Ace", score: 9500 }. Non-matching games: { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 }, { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 }. [Figure: index keys ace, aCe, acxyz, Ace, Ace mdb, ACE, Bob.] "indexBounds" : { "gamerTag" : [ "[\"Ace\", \"Acf\")", "[/^Ace$/, /^Ace$/]" ] }
  • 117. Case Insensitive: db.games.find({ gamerTag: /^Ace$/i }) // equivalent to db.games.find({ gamerTag: { $regex: "^Ace$", $options: "i" } }) // equivalent to db.games.find({ gamerTag: "Ace" }).collation({ locale: 'en', strength: 2 }). Would a $text search be the same as well? Matching games: { gamerTag: "Ace", score: 9500 }, { gamerTag: "ACE", score: 500 }, { gamerTag: "aCe", score: 9500 }, { gamerTag: "ace", score: 0 }. Non-matching games: { gamerTag: "Bob", score: 500 }, { gamerTag: "acxyz", score: 9500 }, { gamerTag: "Ace mdb", score: 9500 }. [Figure: index keys ace, aCe, acxyz, Ace, Ace mdb, ACE, Bob.] "indexBounds" : { "gamerTag" : [ "[\"\", {})", "[/^Ace$/i, /^Ace$/i]" ] }
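The difference between the two sets of bounds can be reproduced in plain JavaScript. This is a sketch under assumptions: `prefixBounds` and `regexScan` are hypothetical helpers, and JavaScript's default string sort (UTF-16 code unit order) stands in for the index's simple binary ordering. A case-sensitive anchored prefix maps to a narrow key range, while the /i variant gives the scan nothing to narrow on, so every string key is visited.

```javascript
// The gamerTag values from the slides, stored in binary-like order.
const tags = ["Ace", "ACE", "aCe", "ace", "Bob", "acxyz", "Ace mdb"];
const index = [...tags].sort();

// For a case-sensitive anchored prefix, the scan is limited to
// [prefix, prefix with its last character incremented).
function prefixBounds(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  return [prefix, prefix.slice(0, -1) + String.fromCharCode(last + 1)];
}

// Visits only the keys inside the bounds (or all keys when bounds is null),
// then applies the regex to each visited key.
function regexScan(regex, bounds) {
  let keysExamined = 0;
  const matches = [];
  for (const key of index) {
    if (bounds && (key < bounds[0] || key >= bounds[1])) continue;
    keysExamined += 1;
    if (regex.test(key)) matches.push(key);
  }
  return { keysExamined, matches };
}

const sensitive = regexScan(/^Ace$/, prefixBounds("Ace"));
const insensitive = regexScan(/^Ace$/i, null);

console.log(prefixBounds("Ace"));
console.log(sensitive.keysExamined, sensitive.matches);
console.log(insensitive.keysExamined, insensitive.matches);
```

In this sketch the case-sensitive query examines only the keys between "Ace" and "Acf", while the case-insensitive one examines all seven keys to find its four matches.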
  • 122. Recommendations: Case-insensitive index! Collations are available since 3.4: db.games.createIndex({ gamerTag: 1 }, { collation: { locale: 'en', strength: 2 } }). Querying with > db.games.find({ gamerTag: "Ace" }).collation({ locale: 'en', strength: 2 }) returns { "_id" : ObjectId("5b29dbee6c7d4f531bf73b5d"), "gamerTag" : "Ace", "score" : 9500 }, { "_id" : ObjectId("5b29dbee6c7d4f531bf73b5e"), "gamerTag" : "ACE", "score" : 500 }, { "_id" : ObjectId("5b29dbee6c7d4f531bf73b5f"), "gamerTag" : "aCe", "score" : 9500 }, { "_id" : ObjectId("5b29dbee6c7d4f531bf73b60"), "gamerTag" : "ace", "score" : 0 }. Alternatively, store a transformed (e.g. toLower()) copy of the string.
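The second recommendation, storing a transformed copy of the string, can be sketched in plain JavaScript. The names `lowerIndex` and `findTagIgnoreCase` are hypothetical, and `toLowerCase()` is only an approximation of what a strength-2 collation does. The idea is that the lookup key is normalized once at write time and once at query time, so the lookup becomes an exact match.

```javascript
// The documents from the slides.
const games = [
  { gamerTag: "Ace", score: 9500 },
  { gamerTag: "ACE", score: 500 },
  { gamerTag: "aCe", score: 9500 },
  { gamerTag: "ace", score: 0 },
  { gamerTag: "Bob", score: 500 },
  { gamerTag: "acxyz", score: 9500 },
  { gamerTag: "Ace mdb", score: 9500 },
];

// Index keyed on the lower-cased tag; each key maps to document positions.
const lowerIndex = new Map();
games.forEach((doc, id) => {
  const key = doc.gamerTag.toLowerCase();
  if (!lowerIndex.has(key)) lowerIndex.set(key, []);
  lowerIndex.get(key).push(id);
});

// Normalizes the search term the same way, then does an exact lookup.
function findTagIgnoreCase(tag) {
  const ids = lowerIndex.get(tag.toLowerCase()) || [];
  return ids.map((id) => games[id]);
}

const found = findTagIgnoreCase("ACE").map((doc) => doc.gamerTag);
console.log(found);
```

The lookup returns the same four games the collation query on the slide returns, without inspecting "Bob", "acxyz", or "Ace mdb".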
  • 125. Stakeholder Concern #3: db.games.find({ gamerTag: "Ace" }).collation({ locale: 'en', strength: 2 }). “Wow, MongoDB can do anything!!!!1!”
  • 128. Work Smarter Not Harder: Understand the business logic. Index appropriately: is it the right index to support the query? Be aware of blocking stages, usage of $or, and case sensitivity. Leverage the Performance Advisor.
  • 129. Countdown to ShortFite Powered by an optimized MongoDB environment, ShortFite is sure to be a hit!