Mythbusting: Understanding How We Measure the Performance of MongoDB
1. Senior Director of Performance Engineering, MongoDB
Alvin Richards
#MongoDBWorld
Mythbusting: Understanding
How We Measure the
Performance of MongoDB
2. Before we start…
• We are going to look a lot at
– C++ kernel code
– Java benchmarks
– JavaScript tests
• And lots of charts…
6. The Milk Train Doesn't Stop Here Anymore
Tennessee Williams
"We all live in a house on fire, no fire department
to call; no way out, just the upstairs window to
look out of while the fire burns the house down
with us trapped, locked in it."
7. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
#1 Time taken to Insert x
Documents
8. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
#1 Time taken to Insert x
Documents
9. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
#1 Time taken to Insert x
Documents
10. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
#1 Time taken to Insert x
Documents
11. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
#1 Time taken to Insert x
Documents
12. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
So that looks ok, right?
13. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
What are else you measuring?
Object creation and GC
management?
14. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
What are else you measuring?
Thread contention on
nextInt()?
Object creation and GC
management?
15. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
What are else you measuring?
Time to synthesize data?
Object creation and GC
management?
Thread contention on
nextInt()?
16. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
What are else you measuring?
Object creation and GC
management?
Thread contention on
addAndGet()?
Thread contention on
nextInt()?
Time to synthesize data?
17. long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…"
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
What are else you measuring?
Object creation and GC
management?
Clock resolution?
Thread contention on
nextInt()?
Time to synthesize data?
Thread contention on
addAndGet()?
18. // Pre Create the Object outside the Loop
BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert];
for (int i=0; i < documentsPerInsert; i++) {
BasicDBObject doc = new BasicDBObject();
String cVal = "…";
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i] = doc;
}
Solution: Pre-Create the objects
Pre-create non varying
data outside the timing
loop
Alternative
• Pre-create the data in a file; load from file
19. // Use ThreadLocalRandom generator or an instance of java.util.Random per thread
java.util.concurrent.ThreadLocalRandom rand;
for (long roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
doc = aDocs[i];
doc.put("_id",id);
doc.put("k", nextInt(rand, numMaxInserts)+1);
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
}
// Maintain count outside the loop
globalInserts.addAndGet(documentsPerInsert * roundNum);
Solution: Remove contention
Remove contention
nextInt() by making
Thread local
20. // Use ThreadLocalRandom generator or an instance of java.util.Random per thread
java.util.concurrent.ThreadLocalRandom rand;
for (long roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
doc = aDocs[i];
doc.put("_id",id);
doc.put("k", nextInt(rand, numMaxInserts)+1);
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
}
// Maintain count outside the loop
globalInserts.addAndGet(documentsPerInsert * roundNum);
Solution: Remove contention
Remove contention on
addAndGet()
Remove contention
nextInt() by making
Thread local
21. long startTime = System.currentTimeMillis();
…
long endTime = System.currentTimeMillis();
long startTime = System.nanoTime();
…
long endTime = System.nanoTime() - startTime;
Solution: Timer resolution
"resolution is at least as
good as that of
currentTimeMillis()"
"granularity of the value
depends on the
underlying operating
system and may be
larger"
Source
• http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
23. BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
#2 Response time to return all
results
24. BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
#2 Response time to return all
results
25. BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
#2 Response time to return all
results
26. BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
#2 Response time to return all
results
27. BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
So that looks ok, right?
28. BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
What are else you measuring?
Each doc is is 4080 bytes
on disk with powerOf2Sizes
29. BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
What are else you measuring?
Each doc is is 4080 bytes
on disk with powerOf2Sizes
Unrestricted predicate?
30. BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
What are else you measuring?
Each doc is is 4080 bytes
on disk with powerOf2Sizes
Measuring
• Time to parse &
execute query
• Time to retrieve all
document
But also
• Cost of shipping ~4MB
data through network
stack
Unrestricted predicate?
31. BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20));
BasicDBObject projection = new BasicDBObject();
projection.put("_id", 1);
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate, projection );
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
Solution: Limit the projection
Return fixed range
32. BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20));
BasicDBObject projection = new BasicDBObject();
projection.put("_id", 1);
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate, projection );
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
Solution: Limit the projection
Only project _id
Return fixed range
33. BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20));
BasicDBObject projection = new BasicDBObject();
projection.put("_id", 1);
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate, projection );
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
Solution: Limit the projection
Only project _id
Only 46k transferred
through network stack
Return fixed range
36. The Physical Principles of the Quantum Theory (1930)
Werner Heisenberg
"Every experiment destroys some of the
knowledge of the system which was obtained by
previous experiments."
49. Ouch… where's the tree in the
woods?
• 2.4.10 -> 2.6.0
– 4495 git commits
50. git-bisect
• Bisect between good/bad hashes
• git-bisect nominates a new githash
– Build against githash
– Re-run test
– Confirm if this githash is good/bad
• Rinse and repeat
61. Response Times
Bulk Insert 2.6.0 vs 2.6.1
Ceiling similar, lower floor
resulting in 40%
improvement in throughput
62. Secondary effects of Yield policy change
Write lock time reduced
Order of magnitude reduction
of write lock duration
63. > db.serverStatus()
Yes – will cause a read lock to be acquired
> db.serverStatus({recordStats:0})
No – lock is not acquired
> mongostat
Yes - until SERVER-14008 resolved, uses db.serverStatus()
Unexpected side effects of
measurement?
64. CPU sampling
• Get an impression of
– Call Graphs
– CPU time spent on node and called nodes
66. > mongodb –dbpath <…>
Note: Do not use –fork
> mongo
> use admin
> db.runCommand({_cpuProfilerStart: {profileFilename: 'foo.prof'}})
Execute some commands that you want to profile
> db.runCommand({_cpuProfilerStop: 1})
Start the profiling
70. Public Benchmarks – Not all forks are
the same…
• YCSB
– https://github.com/achille/YCSB
• sysbench-mongodb
– https://github.com/mdcallag/sysbench-mongodb
72. Beavis & Butthead
"The future sucks. Change it."
"I'm way cool Beavis, but I cannot change the
future."
73. What we are working on
• mongo-perf
– UI refactor
– Adding more micro benchmarks
• Workloads
– Adding external benchmarks
– Creating benchmarks for common use cases
• Inbox fan in/out
• Analytical dashboards
• Stream / Feeds
• Customers, Partners & Community
74. Here's how you can help change the
future!
• Got a great workload? Great benchmark?
• Want to donate it?
• alvin@mongodb.com
75. Don't be that benchmark…
#1 Know what you are measuring
#2 Measure only what you need to
measure
Per Java7 documentation
http://docs.oracle.com/javase/7/docs/api/java/util/Random.html
"Instances of java.util.Random are threadsafe. However, the concurrent use of the same java.util.Random instance across threads may encounter contention and consequent poor performance. Consider instead using ThreadLocalRandom in multithreaded designs."
Per Java7 documentation
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html
"The specifications of these methods enable implementations to employ efficient machine-level atomic instructions that are available on contemporary processors. However on some platforms, support may entail some form of internal locking."
Per Java7 documentation
http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#currentTimeMillis()
"Returns the current time in milliseconds. Note that while the unit of time of the return value is a millisecond, the granularity of the value depends on the underlying operating system and may be larger. For example, many operating systems measure time in units of tens of milliseconds."
Per Jav7 documentation
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadLocalRandom.html
"A random number generator isolated to the current thread…Use of ThreadLocalRandom is particularly appropriate when multiple tasks (for example, each a ForkJoinTask) use random numbers in parallel in thread pools."
Per Java7 documentation
http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime()
"This method provides nanosecond precision, but not necessarily nanosecond resolution (that is, how frequently the value changes) - no guarantees are made except that the resolution is at least as good as that of currentTimeMillis()."