MongoDB from Basics to Scale
1. MongoDB for BillRun!
© All rights reserved: Moshe Kaplan
moshe.kaplan@brightaqua.com
2. MongoDB for BillRun!
Moshe Kaplan
Scale Hacker
http://top-performance.blogspot.com
http://blogs.microsoft.co.il/vprnd
3. It's All About Scale
4. Introduction: NoSQL Answers a Need
5.
6. The Consumer Revolution
http://topyaps.com/wp-content/uploads/2013/03/You-are-the-product.-You-feeling-something.jpg
7. At a Fraction of the Cost…
8.
http://lifehacker.com/5697167/if-youre-not-paying-for-it-youre-the-product
9. Transportation
10. Moovit
11. The Medical Market Opportunities
12. MediSafe
13.
14. Askem
15. Major Enablers: Mobile, Cloud and IT Commoditization
16. The Prime Suspect
17. Assumptions…
18. Where Did It Fail?
Get an Answer, Fast and Cheap
19. Where Did It Fail?
I Just Want "Class Persistency Storage" and Changing Schema on Demand
20. Where Did It Fail?
Be Always Available, Even w/ an Old Answer
21. Where Did It Fail?
Get Me a Fast and Good Enough Answer
22. Where Did It Fail?
Data is Too Big, and Storage is $$$
But CPU and Network are Even More
http://www.powerbyte.com/Isilon.html
23. Software Providers
24. It is all great, but…
I Need to Meet Compliance
http://www.vision7.com/app_system/lib/image/content/PCI_compliance.jpg
25. It is all great, but…
I Need a Vendor
26. It is all great, but…
I Need Reporting
http://www.novell.com/communities/node/5851/get-ready-sentinel-61
27. It is all great, but…
I Need Transactions
http://www.novell.com/communities/node/5851/get-ready-sentinel-61
28. It is all great, but…
We Need Training for the Data Analysts
db.article.aggregate(
  { $group : {
    _id : "$author",                          // GROUP BY author
    docsPerAuthor : { $sum : 1 },             // SUM(1) = N
    viewsPerAuthor : { $sum : "$pageViews" }  // SUM(pageViews)
  }}
);
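For analysts coming from SQL, it may help to see what the $group stage computes. The sketch below is plain JavaScript over an in-memory array, not MongoDB code; the sample documents and the groupByAuthor helper are made up for illustration.

```javascript
// Hypothetical sample data; field names follow the slide's example.
const articles = [
  { author: "alice", pageViews: 10 },
  { author: "bob", pageViews: 5 },
  { author: "alice", pageViews: 7 },
];

// In-memory illustration of the $group stage (not a MongoDB API).
function groupByAuthor(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const g = groups.get(doc.author) ||
      { _id: doc.author, docsPerAuthor: 0, viewsPerAuthor: 0 };
    g.docsPerAuthor += 1;               // { $sum : 1 }
    g.viewsPerAuthor += doc.pageViews;  // { $sum : "$pageViews" }
    groups.set(doc.author, g);
  }
  return [...groups.values()];
}

const perAuthor = groupByAuthor(articles);
console.log(perAuthor);
```

Each output object plays the role of one row of SELECT author, COUNT(*), SUM(pageViews) ... GROUP BY author.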
29. Introduction: NoSQL Market
30. When Should I Choose NoSQL?
• Eventually Consistent
• Document Store
• Key Value
http://guyharrison.squarespace.com/blog/tag/nosq
31. Key Value Store
<Key, Value>
• insert
• get
• multiget
• remove
• truncate
http://wiki.apache.org/cassandra/API
32. Redis
• Very simple protocol (SMTP-like)
• Amazing performance (60K ops/sec on a 1-CPU machine)
• Persistence to disk
• Very little security
33. Column Family Stores: Key Value Store (with benefits)
• insert
• get
• multiget
• remove
• truncate
http://wiki.apache.org/cassandra/API
34. Cassandra
• Simple protocol
• Very good performance
• You have indexes (but limited)
• Data model is a pain
• You need to design your data for queries: "Table per Query"
35. Document Databases
var mydoc = {
  _id: ObjectId("5099803df3f4948bd2f98391"),
  name: { first: "Alan", last: "Turing" },
  birth: new Date('Jun 23, 1912'),
  death: new Date('Jun 07, 1954'),
  contribs: [
    "Turing machine",
    "Turing test",
    "Turingery"
  ],
  views : NumberLong(1250000)
}
36. Database for Software Engineers
• Class → Document
• Subclass → Subdocument
37. MapReduce
http://blogs.microsoft.co.il/blogs/vprnd
38. Introduction: Hello. My Name is MongoDB
39. #5 Most Popular DB Engine
http://db-engines.com/en/ranking
40. Who is Using MongoDB?
41. Who is Behind MongoDB?
42. Why MongoDB?
What?        Why?
JSON         End to end
No Schema    "No DBA", just serialize
Write        10K inserts/sec on a virtual machine
Read         Similar to MySQL
HA           10 min to set up a cluster
Sharding     Out of the box
GeoData      Great for that
No Schema    None: no downtime to create new columns
Buzz         Trend is with NoSQL
43. What is MongoDB Made of?
http://www.10gen.com/products/mongodb
44. Installation: Give Yourself 5 min
• Add to /etc/yum.repos.d/10gen.repo:
  [10gen]
  name=10gen Repository
  baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64
  gpgcheck=0
  enabled=1
• yum -y install mongo-10gen mongo-10gen-server
• The Packages:
  • mongo-10gen: tools
  • mongo-10gen-server: mongod and mongos
45. The Ubuntu Way
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
sudo apt-get -y update
sudo apt-get install -y mongodb-org
46. Installation w/ Authentication
• /etc/mongod.conf
• > mongo
• use admin
  db.createUser(
    {
      user: "siteUserAdmin",
      pwd: "Pss0rdxxx",
      roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
    } )
47. Mastering a New Query Language
48. Connect to the Database
• Connect:
  > mongo
• Show current database:
  > db
• Show databases:
  > show databases;
• Show collections:
  > show collections; or show tables;
49. Databases Manipulation: Create & Drop
• Change database:
  > use <database>
• Create database:
  Just switch to it and create an object…
• Delete database:
  > use mydb;
  > db.dropDatabase();
50. Collections Manipulation
• Create Collection
  > db.createCollection(collectionName)
  Or just insert into it
• Delete Collection
  > db.collectionName.drop()
51. SELECT: No SQL, just ORM…
• Select All
  db.things.find()
• WHERE
  db.posts.find({"comments.email" : "b@c.com"})
• Pattern Matching
  db.posts.find( {"title" : /mongo/i} )
• Sort
  db.posts.find().sort({email : 1, date : -1});
• Limit
  db.posts.find().limit(3)
52. NoSQL and Data Modeling
What is the Difference?
53. Database for Software Engineers
• Class → Document
• Subclass → Subdocument
54. Same Terminology
• Database → Database
• Table → Collection
• Row → Document
55. A Blog Case Study in MySQL
http://www.slideshare.net/nateabele/building-apps-with-mongodb
56. …as a SW Engineer would like it to be
http://www.slideshare.net/nateabele/building-apps-with-mongodb
57. Migration from RDBMS to NoSQL
How to do that?
58. Data Migration
• Map the table structure
• Export the data and import it
• Add indexes
http://igcse-geography-lancaster.wikispaces.com/1.2+MIGRATION
59. Selected Migration Tool
60. Usage Details
> Install ruby
> gem install mongify
… Modify the code to your needs
… Create configuration files
> mongify translation db.config > translation.rb
> mongify process db.config translation.rb
61. Date Functions
• Year(), Month()… functions are included
• … but only in the JavaScript engine
• Solution: new fields!
  • [original field]
  • [original field]_[year part]
  • [original field]_[month part]
  • [original field]_[day part]
  • [original field]_[hour part]
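One possible way to produce those extra fields in the application before insert is sketched below. The withDateParts helper and the "created" field are hypothetical names, and using UTC for the parts is an assumption; pick whatever time zone your queries expect.

```javascript
// Copy a document and add <field>_year/_month/_day/_hour next to a Date field.
// Helper name and suffixes are illustrative; UTC is an assumption.
function withDateParts(doc, field) {
  const d = doc[field];
  return {
    ...doc,
    [`${field}_year`]: d.getUTCFullYear(),
    [`${field}_month`]: d.getUTCMonth() + 1, // getUTCMonth() is zero-based
    [`${field}_day`]: d.getUTCDate(),
    [`${field}_hour`]: d.getUTCHours(),
  };
}

const event = withDateParts({ created: new Date("2013-07-22T14:05:00Z") }, "created");
console.log(event);
```

Queries and indexes can then target plain numeric fields such as created_year instead of calling date functions per document.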
62. Schemaless: No Schema is a Good Thing, But…
63. Default Values
• No Schema
• No Default Values
• App Challenge
• Timestamps…
No single source of truth
64. Casting and Type Safety
• No Schema
• No …
• App Challenge
65. Auto Numbers
• Start using _id
  {
    "_id" : 0,
    "health" : 1,
    "stateStr" : "PRIMARY",
    "uptime" : 59917
  }
• Counter tables
  • Dedicated database
  • 1:1 Mapping
  • Counter++ using findAndModify
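A minimal sketch of the "Counter++ using findAndModify" idea follows. It only builds the argument document; the "counters" collection, the "seq" field and the nextIdCommand helper are assumed names, not something the deck defines.

```javascript
// Build the findAndModify argument that atomically increments a named counter.
// "seq" and the helper name are assumptions for illustration.
function nextIdCommand(counterName) {
  return {
    query: { _id: counterName },   // one counter document per sequence
    update: { $inc: { seq: 1 } },  // Counter++
    new: true,                     // return the document after the increment
    upsert: true,                  // create the counter on first use
  };
}

const cmd = nextIdCommand("lines");
console.log(JSON.stringify(cmd));
// In the mongo shell this would be used roughly as:
//   var id = db.counters.findAndModify(nextIdCommand("lines")).seq;
```

Because findAndModify changes a single document atomically, two concurrent callers should not receive the same number.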
66. ORM Solution
67. Data Analysts
http://www.designersplayground.com/pr/internet-meme-list/data-analyst-2/
68. Data Analysts
• This is not SQL
• There are no joins
• No perfect tools
Tools: Pentaho, RockMongo, MongoVUE, RoboMongo
69. No Joins
• Do in the application
• Leverage the power of NoSQL
http://www.slideshare.net/nateabele/building-apps-with-mongodb
70. Limited Resultset
• 16MB document size
• GridFS
71. Bottom Line
• Powerful tool
• Embrace the Challenge
• Schema-less limitations: counters, data types
• Tools for Data Scientists
• Data design
72. Billing Data Model
73. Design Model
• balances
• bills
• lines
• plans
• queue
• rates
• subscribers
• users
74. Mastering a New Query Language
79. Specific Fields
Select all (empty filter), showing specific fields:
db.users.find(
  { },
  { user_id: 1, status: 1, _id: 0 }
)
1: show; 0: don't show
80. WHERE
• != "A"             { $ne: "A" }
• > 25               { $gt: 25 }
• > 25 AND <= 50     { $gt: 25, $lte: 50 }
• LIKE 'bc%'         /^bc/
• < 25 OR >= 50      { $or : [ { age: { $lt: 25 } }, { age: { $gte : 50 } } ] }
  ($or goes at the top level of the filter; "age" is an example field name)
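The operator fragments above only become usable filters once they are attached to a field. A small sketch of complete filter documents, as they would be passed to find(), is shown here; the field names (status, age, name) and the between helper are assumptions for illustration.

```javascript
// Complete filter documents; field names are hypothetical.
const notA = { status: { $ne: "A" } };                                  // status != 'A'
const startsWithBc = { name: /^bc/ };                                   // name LIKE 'bc%'
const outside = { $or: [{ age: { $lt: 25 } }, { age: { $gte: 50 } }] }; // age < 25 OR age >= 50

// Helper (illustrative) for "field > low AND field <= high".
function between(field, low, high) {
  return { [field]: { $gt: low, $lte: high } };
}

const midRange = between("age", 25, 50);
console.log(JSON.stringify(midRange));
// In the mongo shell: db.users.find(midRange)
```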
81. Join
• Wrong place…
• Or Map Reduce
82. GROUP BY
db.article.aggregate(
  { $group : {
    _id : { author : "$author", name : "$name" },  // GROUP BY author, name
    docsPerAuthor : { $sum : 1 },                  // SUM(1) = N
    viewsPerAuthor : { $sum : "$pageViews" }       // SUM(pageViews)
  }}
);
83. GROUP BY
db.Movie.aggregate([
  // WHERE
  {$match:
    {SeriesType : "F", MovieID : {$in : arrMovies}}
  },
  // Keep some fields
  {$project:
    {MovieID: "$MovieID", SeriesType: "$SeriesType",
     Genres: "$Genres"}
  },
  // Genres is an array
  {$unwind : "$Genres" },
  // Counting and sorting
  {$group : { _id : "$Genres" , count : { $sum : 1 } } },
  {$sort : { count: -1 }}
]);
84. Aggregation Framework Operators
Operator    Description
$project    Adding/removing fields
$match      WHERE
$redact     Changes a document based on its content/structure
$limit      First N documents
$skip       Skips N documents
$unwind     Turns an array into multiple documents
$group      GROUP BY
$sort       Sort
$geoNear    Geospatial
$out        Writes output to a collection
85. UPDATE
db.posts.update(
  {"comments.email": "b@c.com"},
  {$set : {"comments.$.email": "d@c.com"}}  // positional $ assumes comments is an array
)
SET age = age + 3
db.users.update(
  { status: "A" },
  { $inc: { age: 3 } },
  { multi: true }
)
86. INSERT
j = { name : "mongo" }
k = { x : 3 }
db.things.insert( j )
db.things.insert( k )
87. DELETE
db.users.remove(
  { status: "D" }
)
88. Atomic Transactions: Single Row
• Every operation on a document is atomic
• Two Phase Commit implementation is up to you
89. Atomic Transactions: $isolated
Multiple documents at once
db.foo.update(
  { status : "A" , $isolated : 1 },
  { $inc : { count : 1 } },
  { multi: true }
)
Disclaimers:
• Sharding is not supported
• Not all or nothing (no rollback on failure)
90. Atomic Transactions: findAndModify
t = db.transactions.findAndModify({
  query: {
    state: "initial"
  },
  update: {
    $set: {
      state: "pending"
    },
    $currentDate: { lastModified: true }
  },
  new: true
})
91. Atomic Transactions: Bottom Line
If it is about complex transactions, simplify the case or consider staying with an RDBMS.
92. Bulk Operations
Failure and order:
• db.collection.initializeOrderedBulkOp()
• db.collection.initializeUnorderedBulkOp()
1000 ops/bulk:
var bulk = db.items.initializeUnorderedBulkOp();
bulk.insert( { item: "abc123", defaultQty: 100, status: "A", points: 100 } );
bulk.insert( { item: "ijk123", defaultQty: 200, status: "A", points: 200 } );
bulk.insert( { item: "mop123", defaultQty: 0, status: "P", points: 0 } );
bulk.execute();
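If the application wants to control batch boundaries itself rather than rely on the server's grouping, one option is to split the pending operations into groups of at most 1000 before building each bulk. The chunk helper below is an illustrative sketch in plain JavaScript; the 1000 default mirrors the "1000 ops/bulk" figure on the slide.

```javascript
// Split an array of pending operations into batches of at most `size` items.
function chunk(ops, size = 1000) {
  const batches = [];
  for (let i = 0; i < ops.length; i += size) {
    batches.push(ops.slice(i, i + size));
  }
  return batches;
}

// Hypothetical workload: 2500 documents to insert.
const pending = Array.from({ length: 2500 }, (_, i) => ({ item: "item" + i }));
const batches = chunk(pending);
console.log(batches.map((b) => b.length));
// In the mongo shell, each batch could then feed one bulk:
//   var bulk = db.items.initializeUnorderedBulkOp();
//   batch.forEach(function (doc) { bulk.insert(doc); });
//   bulk.execute();
```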
93. Project Setup
Create a new project
Get the Maven configuration for MongoDB Java Driver
• http://mongodb.github.io/mongo-java-driver/
94. Bulk Ops in Java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import org.bson.Document;

// "table" is assumed to be a MongoCollection<Document>
List<Document> documents = new ArrayList<>();
/**** Insert ****/
// Build the documents in memory, then send them with a single insertMany call
for (int i = 1; i < 1000000; ++i) {
  Document document = new Document()
      .append("name", "Moshe Kaplan")
      .append("age", 36 + i)
      .append("createdDate", new Date());
  documents.add(document);
}
table.insertMany(documents);
95. Aggregation Framework in Java
// "countries" is assumed to be a MongoCollection<Document>
List<String> continentList = Arrays.asList("Africa", "Europe", "Asia");
// $match: continent.name IN (Africa, Europe, Asia)
BasicDBObject match = new BasicDBObject("$match",
    new BasicDBObject("continent.name", new BasicDBObject("$in", continentList)));
// $project: keep continent.name and area, drop _id
BasicDBObject projectFields = new BasicDBObject("continent.name", 1);
projectFields.put("area", 1);
projectFields.put("_id", 0);
BasicDBObject project = new BasicDBObject("$project", projectFields);
// $group: average area per continent
BasicDBObject groupFields = new BasicDBObject("_id", "$continent.name");
groupFields.put("average", new BasicDBObject("$avg", "$area"));
BasicDBObject group = new BasicDBObject("$group", groupFields);
List<BasicDBObject> agList = new ArrayList<>();
agList.add(match);
agList.add(project);
agList.add(group);
MongoCursor<Document> cursor = countries.aggregate(agList).iterator();
while (cursor.hasNext()) {
  System.out.println(cursor.next());
}
96. Performance Tuning
Make a Change
97. MongoDB Tuning
98. The Journal
journalCommitInterval = 300:
• Write to disk: 2ms <= t <= 300ms
• Default is 100ms; increase to 300ms to save resources
[Diagram: Memory → Disk: (1) Journal, (2) Data]
99. RAM Optimization
dataSize + indexSize < RAM
[Diagram: OS, Data, Index, Journal]
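The rule of thumb above can be checked against the dataSize and indexSize values that db.stats() reports. The sketch below is plain JavaScript over a stats-like object; the fitsInRam helper and the sample numbers are hypothetical.

```javascript
// True when data plus indexes fit in the given amount of RAM (all values in bytes).
function fitsInRam(stats, ramBytes) {
  return stats.dataSize + stats.indexSize < ramBytes;
}

const GB = 1024 ** 3;
// Hypothetical numbers, shaped like the output of db.stats().
const sampleStats = { dataSize: 10 * GB, indexSize: 3 * GB };

const fitsIn16 = fitsInRam(sampleStats, 16 * GB); // 13 GB of data and indexes vs 16 GB of RAM
const fitsIn8 = fitsInRam(sampleStats, 8 * GB);   // 13 GB of data and indexes vs 8 GB of RAM
console.log(fitsIn16, fitsIn8);
```

In practice the OS and the journal also need memory, so treating the inequality as a lower bound on RAM is the safer reading.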
100. Profiling and Slow Log
101. Profiling Configuration
• Enable:
  • mongod --profile=1 --slowms=15
  • db.setProfilingLevel([level], [time])
• How much:
  • 0 (none), 1 (slow queries only), 2 (all)
  • 100ms: default slow-query threshold
• Where:
  • system.profile collection of the profiled database
102. Profiling Results Analysis
• Last 5 >1ms: show profile
• w/o commands:
  db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
• Specific namespace (database.collection):
  db.system.profile.find( { ns : 'mydb.test' } ).pretty()
• Slower than:
  db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
• Between dates:
  db.system.profile.find({ts : {
    $gt : new ISODate("2012-12-09T03:00:00Z") ,
    $lt : new ISODate("2012-12-09T03:40:00Z")
  }}).pretty()
103. Explain
> db.courses.find().explain();
{ "cursor" : "BasicCursor",
  "isMultiKey" : false,
  "n" : 11, "nscannedObjects" : 11, "nscanned" : 11,
  "nscannedObjectsAllPlans" : 11, "nscannedAllPlans" : 11,
  "scanAndOrder" : false, "indexOnly" : false,
  "nYields" : 0,
  "nChunkSkips" : 0,
  "millis" : 0,
  "indexBounds" : {},
  "server" : "primary.domain.com:27017"
}
104. Indexes
105. Index Management
• Regular Index
  • db.users.createIndex( { user_id: 1 } )
  • db.users.ensureIndex( { user_id: 1 } )
• Multiple + DESC Index
  • db.users.ensureIndex( { user_id: 1, age: -1 } )
• Sub Document Index
  • db.users.ensureIndex( { "address.zipcode": 1 } )
• Unique Index
  • db.users.ensureIndex( { "address.zipcode": 1 } , { unique : true } )
• List Indexes
  • db.users.getIndexes()
• Drop Indexes
  • db.users.dropIndex("indexName")
106. Known Index Issues
• Bound filter should be the last (in the index as well).
• BitMap Indexes not really working
• You should design your indexes carefully
107. Dex: The Index Analyzer
• Installation:
  sudo apt-get -y install python-pip
  sudo pip install dex
• Running:
  dex [mongodb_uri] (-f <logfile_path> | -p) [<options>]
  dex -w -p -n "testdb.*" mongodb://127.0.0.1/testdb -f /var/log/mongodb/mongod.log
108. mtools: Visualize and Analyze Logs
109. Capped Collections
• Fixed-size collections
• Behave like circular buffers
• High-throughput operations
• Order guarantee
db.createCollection("mycoll", {capped: true, size: 100000})
db.cappedCollection.find().sort( { $natural: -1 } )
• Case studies:
  • Logs
  • Cache
110. TTL
• Remove old data automatically
• db.log_events.createIndex(
    { "createdAt": 1 }, { expireAfterSeconds: 3600 }
  )
• db.log_events.insert( {
    "createdAt": new Date('July 22, 2013 14:00:00'),  // the field covered by the TTL index
    "logEvent": 2,
    "logMessage": "Success!"
  } )
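The expiry rule behind a TTL index can be illustrated in a few lines: a document becomes eligible for removal once the indexed date plus expireAfterSeconds is in the past, and documents without a date in that field are not expired. The sketch below is plain JavaScript that mimics that rule; isExpired is a hypothetical helper, and the real removal is done by a server background task, so deletion is not instantaneous.

```javascript
// Illustrative check of the TTL rule (not a MongoDB API).
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // no date in the indexed field: never expires
  return now.getTime() - value.getTime() >= expireAfterSeconds * 1000;
}

const now = new Date("2013-07-22T15:30:00Z");
const oldEvent = { createdAt: new Date("2013-07-22T14:00:00Z"), logEvent: 2 };
const freshEvent = { createdAt: new Date("2013-07-22T15:00:00Z"), logEvent: 3 };
const noDate = { logEvent: 4 };

console.log(
  isExpired(oldEvent, "createdAt", 3600, now),
  isExpired(freshEvent, "createdAt", 3600, now),
  isExpired(noDate, "createdAt", 3600, now)
);
```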
111. Environment Tuning
112. Read-Ahead and Transparent Huge Pages
• # For SSD only
  blockdev --setra 16 /dev/sdb
  blockdev --setra 16 /dev/dm-2
• # For all cluster mongod & mongos
  for i in /sys/kernel/mm/*transparent_hugepage/enabled; do echo never > $i; done
  for i in /sys/kernel/mm/*transparent_hugepage/defrag; do echo never > $i; done
113. Stats & Schema Design
114. Sparse Matrix? I Don't Think So
• mongostat
• > db.stats();
• > db.collectionname.stats();
• Fragmentation if storageSize/size > 2
  • db.collectionname.runCommand("compact")
• Padding (wrong design) if paddingFactor > 2
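The two thresholds above can be applied mechanically to a collection's stats output. This is a small sketch in plain JavaScript; collectionHealth is a hypothetical helper, the sample numbers are invented, and paddingFactor is assumed to be present in the stats (it is an MMAPv1-era field).

```javascript
// Apply the slide's rules of thumb to a stats-like object (sizes in bytes).
function collectionHealth(stats) {
  return {
    fragmented: stats.storageSize / stats.size > 2, // candidate for compact
    overPadded: stats.paddingFactor > 2,            // hints at a document-growth design issue
  };
}

// Hypothetical values shaped like db.collectionname.stats() output.
const health = collectionHealth({ size: 100, storageSize: 350, paddingFactor: 1.2 });
console.log(health);
```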
115. High Availability
Going Real Time
116. (Do Not) Master/Slave
117. Replica Set
• In mongod.conf:
  # Replication Options
  replSet=myReplSet
• > rs.initiate()
• > rs.conf()
• > rs.add("host:port")
• > rs.reconfig()
118. Arbiter
• rs.addArb("host:port")
• Also:
  • Low Priority
  • Hidden
  • (Weighted) Voting
119. Show Status: rs.status();
{ "set" : "myReplSet", "date" : ISODate("2013-02-05T10:23:28Z"),
  "myState" : 1,
  "members" : [
    {
      "_id" : 0, "name" : "primary.example.com:27017",
      "health" : 1, "state" : 1,
      "stateStr" : "PRIMARY", "uptime" : 164545,
      "optime" : Timestamp(1359901753000, 1),
      "optimeDate" : ISODate("2013-02-03T14:29:13Z"), "self" : true
    },
    {
      "_id" : 1, "name" …
120. Replica Set Recovery
• Create a new mongod
  • Either install a plain vanilla one
  • Or duplicate an existing mongod (better)
• Connect to the system
  • Use the previous machine's IP
  • Or change the configuration to remove the old member and add the new one
121. Sharding and Scale Out: Make a Big Change
Map Reduce and Aggregation
122. Secondary Read Enabling
123. The Strategy: Sharding
124. MongoDB Implementation
Step 1: Create a Config ReplicaSet
• mkdir /data/configdb
• mongod --configsvr --dbpath /data/configdb --port 27019
126. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun!
Step 2: Install Mongos
• mongos --configdb config01:27019, config02:27019, config03:27019
127. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun!
Step 3: Add Shards
• Connect a mongos
• Add Shard
• sh.addShard( "rs1/mongodb0.example.net:27017" )
• sh.addShard( "mongodb0.example.net:27017" )
128. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun!
Step 4: Enable Sharding
• sh.enableSharding("<database>")
129. © All rights reserved: Moshe Kaplan© All rights reserved: Moshe Kaplan MongoDB for BillRun!
Step 5: Sharding Colleciton
• sh.shardCollection("<database>.<collection>", shard-key-pattern)
• sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )
• Keys:
• High Cardinality to enable split
• Use common query field
• Use Compound indexes for sharding
130. Backup and Monitoring
131. First Option: Single Server
Logical Backup
• Method: mongodump
• Pros: Low costs
• Cons:
  • Downtime: Long
  • Duration: Long (slow backup since logical data needs to be extracted)
  • Performance impact: High (slows the disks and may hang heavily used machines)
  • Data consistency: Intact
  • Differential: Supported
  • Sharding: Supported
Physical Backup
• Method: Point-in-time snapshot (using LVM tools) or disk image/copy (using AWS or Azure "external" tools)
• Pros: Low costs
• Cons:
  • Downtime: OS and/or infrastructure dependent
  • Duration: Short (faster backup since only data blocks are copied)
  • Performance impact: Unknown (depends on OS and/or infrastructure)
  • Data consistency: Unknown state
  • Differential: Infrastructure dependent
  • Sharding: Unsupported
Sharding is a type of database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards. The word shard means a small part of a whole.
132. Second Option: Replica Set
Logical Backup
• Method: mongodump
• Pros:
  • Downtime: None (backup is performed using the slave server; the master server is always up)
  • Duration: Not significant (backup is performed using the slave server)
  • Performance impact: None (backup is performed using the slave server; the master server is not impacted)
  • Data consistency: Intact
  • Differential: Supported
  • Sharding: Supported
• Cons: Very high costs. Requires two additional servers: a slave server of the same type and size as the master server, and a small arbiter server (used as a secondary verification for master server availability tests and "voting").
Physical Backup
• Method: Stop the slave and copy its disk
• Pros:
  • Downtime: None
  • Duration: Not significant
• Cons:
  • Costs: Requires a dedicated server per replica set
133. Third Option: MongoDB MMS
• Part of the MongoDB Enterprise Edition or as a cloud service
• The cloud service offer:
  • $50/month/node
  • $2.5/GB/month backup
• A valid go-to-market route for MongoDB to upsell
• MMS features:
  • Point in time recovery
  • Daily snapshots
  • Detailed monitoring
  • Alerts
134. How to Enable Incremental Backup
• In backup
  • Use the --oplog flag when doing mongodump
  • Dump the oplog collection of the local database (local.oplog.rs in a replica set) each hour
• In recovery
  • mongorestore --oplogReplay
  • applyOps to apply the hourly dumps
135. mongostat
136. mongotop
137. db.serverStatus()
138. db.stats() and db.collection.stats()
139. rs.status()
140. Storage Engines
141. MMAPv1
142. MongoDB 3.0 and WiredTiger
• MongoDB version 3.0 supports a new storage engine (WiredTiger):
  • Disk compression
  • Heavy write
  • Document-level locking
  • File per collection
• Server-wide selection:
  • config.yaml
  • launch w/ --storageEngine wiredTiger
143. MongoDB Pluggable Architecture
144. Engines Comparison
145. YAML Based Configuration
storage:
  dbPath: "/var/lib/mongodbwt"
  directoryPerDB: true
  engine: "wiredTiger"
  wiredTiger:
    engineConfig:
      cacheSizeGB: 16
      journalCompressor: zlib
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true
systemLog:
  destination: file
  path: "/var/log/mongodb/mongod.log"
  logAppend: true
  timeStampFormat: iso8601-local
processManagement:
  fork: true
  pidFilePath: "/var/run/mongodb.pid"
#security:
#  keyFile: "/etc/mongo.key"
#  authorization: "enabled"
replication:
  replSetName: "arp0"
146. Security
147. Providing Permissions
• use admin
  db.createUser( {
    user: "siteUserAdmin", pwd: "password",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  } )
• use records
  db.createUser( {
    user: "recordsUserAdmin", pwd: "password",
    roles: [ { role: "userAdmin", db: "records" } ]
  } )
148. Roles
read
readWrite
dbAdmin
dbOwner
userAdmin
clusterAdmin, clusterManager, …
backup, restore
readAnyDatabase, readWriteAnyDatabase
root
149. Granular Actions
use admin
db.createRole(
  {
    role: "manageOpRole",
    privileges: [
      { resource: { cluster: true }, actions: [ "killop", "inprog" ] },
      { resource: { db: "", collection: "" }, actions: [ "killCursors" ] }
    ],
    roles: []
  }
)
150. Thank You!
Moshe Kaplan
moshe.kaplan@brightaqua.com
054-2291978