MongoDB Indexing Constraints & Creative Schemas
Chris Winslett
chris@mongohq.com
Thursday, June 27, 13
My Background
• For the past year, I’ve looked at MongoDB logs at least once every day.
• We routinely answer the question “how can I improve performance?”
Who’s this talk for?
• New to MongoDB
• Seeing some slow operations, and need help debugging
• Running database operations on a sizeable deployment
• I have a MongoDB deployment, and I’ve hit a performance wall
What should you learn?
Know where to look on a running MongoDB to uncover slowness, and discuss solutions.
MongoDB has performance “patterns”.
How to think about improving performance.
And . . .
Schema Design
Design with the end in mind.
MongoDB Indexing Constraints
• One index per query *
• One range operator per query ($gt, $lt, and similar)
• Range operator must be last field in index
• Using RAM well
* Except $or, which can use one index per clause; the sin with $or is appending a sort to the query.
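The constraints above imply an ordering for the fields of a compound index: equality matches first, then sort fields, then the single range field last. A minimal sketch in plain JavaScript, assuming a hypothetical helper `buildIndexSpec` (not a MongoDB API) and an assumed operator list in which `$in` counts as a multi-value operator; top-level operators such as `$or` are not handled:

```javascript
// Hypothetical helper, not part of MongoDB: orders compound index fields so
// equality fields come first, sort fields next, and the range field last.
// The operator list is an assumption made for this sketch.
const RANGE_OPERATORS = ["$gt", "$gte", "$lt", "$lte", "$in", "$nin", "$ne"];

function isRangeCondition(condition) {
  // A plain value (string, number, date, ...) is an equality match.
  if (condition === null || typeof condition !== "object" || Array.isArray(condition)) {
    return false;
  }
  return Object.keys(condition).some((op) => RANGE_OPERATORS.includes(op));
}

function buildIndexSpec(query, sort = {}) {
  const rangeFields = Object.keys(query).filter((f) => isRangeCondition(query[f]));
  if (rangeFields.length > 1) {
    // "One range operator per query": refuse rather than guess.
    throw new Error("More than one range field: " + rangeFields.join(", "));
  }
  const spec = {};
  for (const field of Object.keys(query)) {
    if (!rangeFields.includes(field)) spec[field] = 1; // equality fields first
  }
  for (const [field, direction] of Object.entries(sort)) {
    if (!(field in spec) && !rangeFields.includes(field)) spec[field] = direction; // then sort fields
  }
  for (const field of rangeFields) {
    spec[field] = field in sort ? sort[field] : 1; // range field last
  }
  return spec;
}

console.log(JSON.stringify(buildIndexSpec({ publisher: "US Weekly" }, { publishedAt: -1 })));
// prints {"publisher":1,"publishedAt":-1}
```

A query with two multi-value fields makes the helper throw, which is the signal to rethink the schema rather than add another index.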
The Tools
• `mongostat` for MongoDB Behavior
• `tail` the logs for current operations
• `iostat` for disk utilization
• `top -c` for CPU usage
First, a Simple One
query  getmore  command  res   faults  locked db   ar|aw  netIn  netOut  conn  time
129    4        7        126m  2       my_db:0.0%  3|0    27k    445k    42    15:36:54
64     4        3        126m  0       my_db:0.0%  5|0    12k    379k    42    15:36:55
65     7        8        126m  0       my_db:0.1%  3|0    15k    230k    42    15:36:56
65     3        3        126m  1       my_db:0.0%  3|0    13k    170k    42    15:36:57
66     1        6        126m  1       my_db:0.0%  0|0    14k    262k    42    15:36:58
32     8        5        126m  0       my_db:0.0%  5|0    5k     445k    42    15:36:59
a truncated mongostat
Alerted due to high CPU
log
[conn73454] query my_db.my_collection query: { $query:
{ publisher: "US Weekly" }, orderby: { publishedAt: -1 } }
ntoreturn:5 ntoskip:0 nscanned:33236 scanAndOrder:1
keyUpdates:0 numYields: 21 locks(micros) r:317266
nreturned:5 reslen:3127 178ms
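The telling numbers in this log line are `nscanned` and `nreturned`: tens of thousands of entries scanned to return five documents. A small sketch in plain JavaScript for pulling those counters out of a log line; `parseCounters` and `scanRatio` are hypothetical helpers, and the regular expression assumes the `key:value` counter format shown above:

```javascript
// Counter portion of the slow-query log line from the slide.
const counterText =
  "ntoreturn:5 ntoskip:0 nscanned:33236 scanAndOrder:1 " +
  "keyUpdates:0 numYields: 21 locks(micros) r:317266 nreturned:5 reslen:3127 178ms";

// Hypothetical helper: collects every numeric "key:value" pair on the line,
// plus the trailing duration in milliseconds.
function parseCounters(line) {
  const counters = {};
  for (const match of line.matchAll(/(\w+):\s?(\d+)\b/g)) {
    counters[match[1]] = Number(match[2]);
  }
  const duration = line.match(/(\d+)ms$/);
  if (duration) counters.millis = Number(duration[1]);
  return counters;
}

// Entries scanned per document returned; a high value hints at a missing index.
function scanRatio(counters) {
  return counters.nscanned / Math.max(counters.nreturned, 1);
}

const counters = parseCounters(counterText);
console.log(counters.nscanned, counters.nreturned, scanRatio(counters));
```

A ratio close to 1 means the index is doing the work; here it is in the thousands, and `scanAndOrder:1` shows the sort also ran in memory.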
Solution
We are fixing this query:
{ $query: { publisher: "US Weekly" }, orderby: { publishedAt: -1 } }
With this index:
db.my_collection.ensureIndex({publisher: 1, publishedAt: -1}, {background: true})
I would show you the logs, but now they are silent.
The Pattern
Inefficient read queries doing in-memory table scans cause high CPU load.
The cause is indexes that do not match the queries.
Example 2
MongoDB Twitter-ish Feed
Customer was building a network graph of users.
Naive Method
Statuses
{
creator_id: ObjectId(),
status: "This is so awesome!"
}
Users
{
_id: ObjectId(),
friends: [array-o-friends]
}
Query
db.statuses.find({creator_id: {$in: [array-o-friends]}}).sort({_id: -1})
Solution
Statuses
{
creator_id: ObjectId(),
friends_of_creator: [array-of-viewers],
status: "This is so awesome!"
}
Users
{
_id: ObjectId(),
friends: [array-o-friends]
}
Query
db.statuses.find({friends_of_creator: ObjectId()}).sort({_id: -1})
The Pattern
With graphs, query on “viewable by”: store who can view each document, and query on that field.
What worked with a minimal number of documents was not scaling.
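The “viewable by” pattern moves the work to write time: each status carries a copy of its creator's friend list, so a feed becomes a single-value match on one field. A minimal in-memory sketch, assuming hypothetical helpers `buildStatus` and `feedFor` and string ids standing in for `ObjectId()` values:

```javascript
// Hypothetical helper: copies the creator's friends onto the status at write time.
function buildStatus(creator, text) {
  return {
    creator_id: creator._id,
    // Copy the array so later changes to the user do not alter old statuses.
    friends_of_creator: [...creator.friends],
    status: text,
  };
}

// In-memory stand-in for:
//   db.statuses.find({friends_of_creator: viewerId}).sort({_id: -1})
// The array is assumed to be in insertion order, so reversing gives newest first.
function feedFor(viewerId, statuses) {
  return statuses.filter((s) => s.friends_of_creator.includes(viewerId)).reverse();
}

const alice = { _id: "alice", friends: ["bob", "carol"] };
const bob = { _id: "bob", friends: ["alice"] };

const statuses = [
  buildStatus(alice, "first from alice"),
  buildStatus(bob, "first from bob"),
  buildStatus(alice, "second from alice"),
];

console.log(feedFor("bob", statuses).map((s) => s.status));
```

The trade-off is larger documents and extra work whenever a friend list changes, in exchange for a read that a single index on `friends_of_creator` can serve.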
Similar Issues - Messages
Naive
{
sender_id: ObjectId(),
recipient_id: ObjectId(),
message: "This is so awesome!"
}
Naive Query
db.messages.find({$or: [{sender_id: ObjectId()}, {recipient_id: ObjectId()}]}).sort({_id: -1})
Solution
{
sender_id: ObjectId(),
recipient_id: ObjectId(),
participants: [ObjectId(), ObjectId()],
thread_id: ObjectId(),
message: "This is so awesome!"
}
Query
db.messages.find({participants: ObjectId()}).sort({_id: -1})
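The refactor is safe only if the `participants` match selects the same messages as the `$or` on sender and recipient. A short in-memory check, assuming a hypothetical `buildMessage` helper and string ids in place of `ObjectId()` values (sorting is left out):

```javascript
// Hypothetical helper: stores both parties in one array field at write time.
function buildMessage(senderId, recipientId, threadId, text) {
  return {
    sender_id: senderId,
    recipient_id: recipientId,
    participants: [senderId, recipientId],
    thread_id: threadId,
    message: text,
  };
}

// In-memory stand-ins for the two query predicates.
const naiveMatch = (m, userId) => m.sender_id === userId || m.recipient_id === userId;
const participantsMatch = (m, userId) => m.participants.includes(userId);

const messages = [
  buildMessage("ann", "bob", "t1", "hi"),
  buildMessage("bob", "ann", "t1", "hello"),
  buildMessage("cat", "bob", "t2", "hey"),
];

const viaOr = messages.filter((m) => naiveMatch(m, "ann"));
const viaParticipants = messages.filter((m) => participantsMatch(m, "ann"));
console.log(viaOr.length, viaParticipants.length);
```

Both predicates pick the same two messages for "ann", but only the second one avoids combining `$or` with a sort.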
Example 3
insert  query  update  delete  getmore  command  faults  locked %  idx miss %  qr|qw  ar|aw
*0      *0     *0      *0      0        1|0      1422    0         0           0|0    50|0
*0      6      *0      *0      0        6|0      575     0         0           0|0    51|0
*0      3      *0      *0      0        1|0      1047    0         0           0|0    50|0
*0      2      *0      *0      0        3|0      1660    0         0           0|0    50|0
a truncated mongostat
Alerted on high CPU
tail
[initandlisten] connection accepted from ....
[conn4229724] authenticate: { authenticate: ....
[initandlisten] connection accepted from ....
[conn4229725] authenticate: { authenticate: .....
[conn4229717] query ..... 102ms
[conn4229725] query ..... 140ms
amazingly quiet
currentOp
> db.currentOp()
{
  "inprog" : [
    {
      "opid" : 66178716,
      "lockType" : "read",
      "secs_running" : 760,
      "op" : "query",
      "ns" : "my_db.my_collection",
      "query" : {
        "keywords" : { "$in" : ["keyword1", "keyword2"] },
        "tags" : { "$in" : ["tags1", "tags2"] }
      },
      "orderby" : {
        "created_at" : -1
      },
      "numYields" : 21
    }
  ]
}
Solution
Return stability to the database:
> db.currentOp().inprog.filter(function(row) {
return row.secs_running > 100 && row.op == "query"
}).forEach(function(row) {
db.killOp(row.opid)
})
Then disable the query, and refactor the schema.
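Before killing anything, it can help to dry-run the same filter and look at which operations it would select. A plain-JavaScript sketch against a sample `inprog` array; `findLongRunningQueries` is a hypothetical helper, the first entry mirrors the slide, and the other entries are made up for illustration:

```javascript
// Sample "inprog" array. Only the first entry comes from the slide;
// the rest are invented to exercise the filter.
const sampleInprog = [
  { opid: 66178716, op: "query", secs_running: 760, ns: "my_db.my_collection" },
  { opid: 101, op: "query", secs_running: 3, ns: "my_db.my_collection" },
  { opid: 102, op: "insert", secs_running: 500, ns: "my_db.other" },
  { opid: 103, op: "query", ns: "my_db.my_collection" }, // no secs_running reported
];

// Hypothetical helper: returns the opids the kill loop would act on.
// Entries without secs_running are skipped because undefined > n is false.
function findLongRunningQueries(inprog, thresholdSecs) {
  return inprog
    .filter((row) => row.op === "query" && row.secs_running > thresholdSecs)
    .map((row) => row.opid);
}

console.log(findLongRunningQueries(sampleInprog, 100)); // [ 66178716 ]
```

Only the long-running read is selected; writes and short queries are left alone.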
Refactoring
I have one word for you: “Schema”
Example 4
A map reduce has gradually run slower and slower.
Finding Offenders
Find the time of the slowest query of the day (the last line of the sorted output):
grep -E '[0-9]{3,100}ms$' "$MONGODB_LOG" | awk '{print $NF}' | sort -n
Slowest Map Reduce
my_db.$cmd command: {
mapreduce: "my_collection",
map: function() {},
query: { $or: [
{ object.type: "this" },
{ object.type: "that" }
],
time: { $lt: new Date(1359025311290), $gt: new Date(1358420511290) },
object.ver: 1,
origin: "tnh"
},
out: "my_new_collection",
reduce: function(keys, vals) { ....}
} ntoreturn:1 keyUpdates:0 numYields: 32696 locks(micros)
W:143870 r:511858643 w:6279425 reslen:140 421185ms
Solution
Problem
Query is slow because it has multiple multi-value operators: $or, $gt, and $lt
Solution
Change schema to use an “hour_created” attribute:
hour_created: “%Y-%m-%d %H”
Create an index on “hour_created” followed by the “$or” values. Query
using the new “hour_created.”
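The “hour_created” attribute turns a timestamp range into discrete bucket values. A sketch of how the bucket string could be computed at write time, and how the buckets covering a time window could be listed. `hourCreated` and `hourBuckets` are hypothetical helpers, UTC is an assumption, and the slides do not say how the buckets are then used in the query:

```javascript
// Hypothetical helper: formats a Date as "%Y-%m-%d %H" in UTC (UTC is an assumption).
function hourCreated(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return (
    date.getUTCFullYear() + "-" + pad(date.getUTCMonth() + 1) + "-" +
    pad(date.getUTCDate()) + " " + pad(date.getUTCHours())
  );
}

// Hypothetical helper: lists every hour bucket that overlaps [start, end].
function hourBuckets(start, end) {
  const buckets = [];
  const cursor = new Date(start.getTime());
  cursor.setUTCMinutes(0, 0, 0); // truncate to the start of the hour
  while (cursor <= end) {
    buckets.push(hourCreated(cursor));
    cursor.setUTCHours(cursor.getUTCHours() + 1); // rolls over days and months
  }
  return buckets;
}

// The two bounds from the slow map reduce query.
const windowStart = new Date(1358420511290);
const windowEnd = new Date(1359025311290);
const buckets = hourBuckets(windowStart, windowEnd);
console.log(buckets[0], buckets[buckets.length - 1], buckets.length);
```

Storing the bucket as a string keeps it cheap to index and compare; the cost is that the hour granularity is fixed once documents are written.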
Words of caution
2 of the 4 solutions were to add an index.
Adding new indexes as a solution scales poorly.
Sometimes . . .
It is best to do nothing, except add shards / add hardware.
Go back to the drawing board on the design.
Bad things happen to good databases?
• ORMs
• Manage your indexes and queries.
• Constraints will set you free.
Road Map for Refactoring
• Measure, measure, measure.
• Find your slowest queries and determine if they can be indexed
• Rephrase the problem you are solving by asking “How do I want to query my data?”
Thank you!
• Questions?
• E-mail me: chris@mongohq.com