MongoDB
By wiTTyMindsTech
MongoDB Overview
MongoDB is a cross-platform document-oriented database program. Classified as a NoSQL
database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is
developed by MongoDB Inc. and licensed under the Server Side Public License.
[Diagram: an EMPLOYEE record with NAME, ID and COMPANY fields]
NoSQL DB Overview
Unlike MySQL and Oracle databases, NoSQL databases deal in documents and files rather than rigid tables.
NoSQL, which stands for "not only SQL," is an alternative to traditional relational databases, in which
data is placed in tables and the data schema is carefully designed before the database is built. NoSQL
databases are especially useful for working with large sets of distributed data.
JSON Introduction
JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to
read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript
Programming Language Standard ECMA-262 3rd Edition - December 1999.
JSON Data types
At the granular level, JSON consists of six data types. The first four (string, number, boolean and null)
can be referred to as simple data types; the other two (object and array) can be referred to as complex
data types.
1. string
2. number
3. boolean
4. null/empty
5. object
6. array
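These types map directly onto native types in most languages; a quick Python sketch (the json module is used here purely for illustration) shows the mapping:

```python
import json

# A document exercising all six JSON data types.
doc = '''{
    "name": "Alice",
    "age": 30,
    "active": true,
    "nickname": null,
    "address": {"city": "Pune"},
    "tags": ["db", "nosql"]
}'''

parsed = json.loads(doc)
print(type(parsed["name"]).__name__)     # str  (JSON string)
print(type(parsed["age"]).__name__)      # int  (JSON number)
print(type(parsed["active"]).__name__)   # bool (JSON boolean)
print(parsed["nickname"])                # None (JSON null)
print(type(parsed["address"]).__name__)  # dict (JSON object)
print(type(parsed["tags"]).__name__)     # list (JSON array)
```

MongoDB's BSON builds on these six types, adding further types such as dates, ObjectId and binary data.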
MongoDB Overview
Document Database
A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB
documents are similar to JSON objects. The values of fields may include other documents, arrays, and
arrays of documents.
MongoDB Collections/Views
MongoDB stores documents in collections. Collections are analogous to tables in relational databases.
In addition to collections, MongoDB supports:
● Read-only Views (Starting in MongoDB 3.4)
● On-Demand Materialized Views (Starting in MongoDB 4.2)
MongoDB Overview
MongoDB is a free and open-source database.
It is a NoSQL database that uses JSON-like documents.
It is a cross-platform database and very easy to deploy on the cloud and on servers.
It is one of the most important databases you can work with these days.
MongoDB makes working with data simple.
It prioritizes performance and efficiency.
It is very popular, and MongoDB developers are in high demand.
MongoDB CRUD
● How to perform CRUD (Create, Read, Update, Delete) operations on
MongoDB databases
MongoDB Create
● db.collection.insertOne() New in version 3.2
● db.collection.insertMany() New in version 3.2
MongoDB Read
db.collection.find()
You can specify query filters or criteria that identify the documents to return.
MongoDB Update
● db.collection.updateOne() New in version 3.2
● db.collection.updateMany() New in version 3.2
● db.collection.replaceOne() New in version 3.2
MongoDB Delete
Delete operations remove documents from a collection. MongoDB provides the following methods to delete
documents of a collection:
● db.collection.deleteOne() New in version 3.2
● db.collection.deleteMany() New in version 3.2
MongoDB Objectives
● How to perform CRUD (Create, Read, Update, Delete) operations on MongoDB databases
● How to filter for data efficiently
● How to work with both the Mongo Shell and drivers (e.g. Node.js driver)
● How to increase performance by using indexes (and how to use the right indexes!)
● How to use the amazing “Aggregation Framework” that’s built into MongoDB
● What replica sets and sharding are
● How to use MongoDB Atlas — the cloud solution offered by MongoDB
● How to use the serverless platform (Stitch) offered by MongoDB
MongoDB Advantages
● Schema-less − MongoDB is a document database in which one collection holds different documents. The number of fields,
content and size of documents can differ from one document to another.
● Structure of a single object is clear.
● No complex joins.
● Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language that's nearly
as powerful as SQL.
● Tuning.
● Ease of scale-out − MongoDB is easy to scale.
● Conversion/mapping of application objects to database objects not needed.
● Uses internal memory for storing the (windowed) working set, enabling faster access of data.
● Scalability
MongoDB Objectives
● Download and install MongoDB
● Modify environment variables
● Start and stop mongoDB server
● Connect to MongoDB using python
● Perform CRUD operations
MongoDB Objectives
● Create a database
● Create a collection
● Insert documents
● Combine collections
● Import data into mongoDB
● Backup and restore mongoDB
● Create indexes
Why Use MongoDB?
● Document Oriented Storage − Data is stored in the form of JSON style documents.
● Index on any attribute
● Replication and high availability
● Auto-sharding
● Rich queries
● Fast in-place updates
● Professional support by MongoDB
Where to use MongoDB?
● Big Data
● Content Management and Delivery
● Mobile and Social Infrastructure
● User Data Management
● Data Hub
MongoDB Data Modelling
Suppose a client needs a database design for a blog/website; let us see the differences between RDBMS and MongoDB
schema design. The website has the following requirements.
● Every post has a unique title, a description and a URL.
● Every post can have one or more tags.
● Every post has the name of its publisher and a total number of likes.
● Every post has comments given by users, along with their name, message, date-time and likes.
● On each post, there can be zero or more comments.
MongoDB Data Modelling
MongoDB Structure
{
_id: POST_ID,
title: TITLE_OF_POST,
description: POST_DESCRIPTION,
by: POST_BY,
url: URL_OF_POST,
tags: [TAG1, TAG2, TAG3],
likes: TOTAL_LIKES,
comments: [
{
user:'COMMENT_BY',
message: TEXT,
dateCreated: DATE_TIME,
like: LIKES
},
{
user:'COMMENT_BY',
message: TEXT,
dateCreated: DATE_TIME,
like: LIKES
}
]
}
MongoDB Installation
$sudo apt-get update
$sudo apt install -y mongodb
$sudo systemctl status mongodb
$mongo --eval 'db.runCommand({ connectionStatus: 1 })'
$mongo
>show databases
>use mydb
>show collections
MongoDB Conf file and Settings
/etc/mongodb.conf
A given MongoDB database is broken up into a series of BSON data files on disk, each growing in size up to 2 GB
(with the MMAPv1 storage engine). BSON is a binary format built specifically for MongoDB.
MongoDB login
$ mongo
Will try to connect to localhost
Using ip address and port
$ mongo --host 192.168.64.24:27017
Or
$mongo --host 192.168.64.24 --port 27017
MongoDB User Creation
1. Set up your user
use test
db.createUser({
user: 'john',
pwd: 'secretPassword',
roles: [{ role: 'readWrite', db:'test'}]
})
MongoDB User Creation
mongo -u john -p secretPassword 192.168.64.24/test
MongoDB shell version v3.6.3
connecting to: mongodb://192.168.64.24:27017/test
MongoDB server version: 3.6.3
> db.inv.insert({"name":"abc1", "add":"street3"})
WriteResult({ "nInserted" : 1 })
MongoDB User Creation
1. Set up your user for read only
use test
db.createUser({
user: 'Sya',
pwd: 'secretPassword',
roles: [{ role: 'read', db:'test'}]
})
MongoDB User Creation
$ mongo -u Sya -p secretPassword 192.168.64.24/test
MongoDB shell version v3.6.3
connecting to: mongodb://192.168.64.24:27017/test
MongoDB server version: 3.6.3
> db.inv.insert({"name":"abc1", "add":"street3"})
WriteResult({
	"writeError" : {
		"code" : 13,
		"errmsg" : "not authorized on test to execute command { insert: \"inv\", ordered: true, $db: \"test\" }"
	}
})
MongoDB User login
$ mongo -u john -p secretPassword 192.168.64.24/test
MongoDB security authorization
security:
authorization: 'enabled'
MongoDB security authorization
Create a user before enabling authorization (auth=true) in mongodb.conf:
db.createUser(
... {
... user: "myUserAdmin",
... pwd: "abc123",
... roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
... }
... )
Then uncomment auth=true and restart MongoDB:
mongo --host 192.168.64.24 --port 27017 -u myUserAdmin -p --authenticationDatabase admin
MongoDB Create DB
In MongoDB, use DATABASE_NAME is used to create a database. The command creates a new database if it
doesn't exist; otherwise, it switches to the existing database.
If you didn't create any database, collections will be stored in the test database.
>use mydb
switched to db mydb
>db.dropDatabase()
MongoDB Create
Create a collection explicitly:
● db.createCollection("mycol", { capped : true, autoIndexId : true, size : 6142800, max : 10000 } )
MongoDB Data Types
● String
● Integer
● Boolean
● Double
● Object, Array, Null, Date, ObjectId, etc.
MongoDB Create Collection
The cool thing about MongoDB is that you need not create a collection before you insert a document into it.
With a single command you can insert a document into a collection, and MongoDB creates that collection
on the fly.
Syntax: db.collection_name.insert({key: value, key: value, …})
MongoDB Create Collection
db.beginnersbook.insert({
name: "Chaitanya",
age: 30,
website: "beginnersbook.com"
})
MongoDB Insert
db.post.insert([
{
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100
},
{
title: 'NoSQL Database',
description: "NoSQL database doesn't have tables",
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 20,
comments: [
{
user:'user1',
message: 'My first comment',
dateCreated: new Date(2013,11,10,2,35),
like: 0
}
]
}
])
MongoDB Collection insert methods
db.collection.insertOne()
Inserts a single document into a collection.
db.collection.insertMany()
db.collection.insertMany() inserts multiple documents into a collection
db.collection.insert()
db.collection.insert() inserts a single document or multiple documents into a collection
MongoDB Collection insert
The insert() Method
db.COLLECTION_NAME.insert(document)
Example:
db.mycol.insert({
_id: ObjectId("7df78ad8902c"),
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100
})
MongoDB Collection insert multiple
We can also pass an array of documents:
db.post.insert([
{
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100
},
{
title: 'NoSQL Database',
description: "NoSQL database doesn't have tables",
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 20,
comments: [
{
user:'user1',
message: 'My first comment',
dateCreated: new Date(2013,11,10,2,35),
like: 0
}
]
}
])
MongoDB Collection find method
db.COLLECTION_NAME.find()
find() method will display all the documents in a non-structured way.
The pretty() Method
To display the results in a formatted way, you can use pretty() method.
Syntax
>db.mycol.find().pretty()
MongoDB Collection InsertOne
db.products.insertOne( { item: "card", qty: 15 } );
The products collection will be created automatically.
MongoDB Collection find method
>db.mycol.find().pretty()
{
	"_id": ObjectId("7df78ad8902c"),
	"title": "MongoDB Overview",
	"description": "MongoDB is no sql database",
	"by": "tutorials point",
	"url": "http://www.tutorialspoint.com",
	"tags": ["mongodb", "database", "NoSQL"],
	"likes": 100
}
>
MongoDB Collection insert data
db.inventory.insertMany([
{ item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" },
status: "A" },
{ item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" },
status: "A" },
{ item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" },
status: "D" },
{ item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" },
status: "D" },
{ item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" },
status: "A" }
]);
MongoDB Collection select data
db.inventory.find( {} )
This operation corresponds to the following SQL statement:
SELECT * FROM inventory
db.inventory.find( { status : "D" } )
This operation corresponds to the following SQL statement:
SELECT * FROM inventory WHERE status = "D"
MongoDB Collection select data
The following example retrieves all documents from the inventory collection where status equals either "A"or
"D":
db.inventory.find( { status: { $in: [ "A", "D" ] } } )
SELECT * FROM inventory WHERE status IN ("A", "D")
MongoDB Collection select data
The following example retrieves all documents from the inventory collection where status does not equal
"A" or "D":
db.inventory.find( { status: { $nin: [ "A", "D" ] } } )
SELECT * FROM inventory WHERE status NOT IN ("A", "D")
MongoDB Collection select data
# count the number of matching documents
db.inventory.count( { status: { $nin: [ "A", "D" ] } } )
db.inventory.find( { status: "A", qty: { $lt: 30 } } )
The operation corresponds to the following SQL statement:
SELECT * FROM inventory WHERE status = "A" AND qty < 30
OR in MongoDB
Using the $or operator, you can specify a compound query that joins each clause with a logical OR conjunction so
that the query selects the documents in the collection that match at least one condition.
db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
SELECT * FROM inventory WHERE status = "A" OR qty < 30
The $and form below is equivalent, listing both conditions explicitly:
db.inventory.find( { $and: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
Exporting a MongoDB collection to a JSON file
$ sudo mongoexport --host 192.168.64.24 --db config --collection inventory --out
/home/parallels/status.json
OR in MongoDB
To query documents based on the OR condition, you need to use $or keyword. Following is the basic syntax of OR −
>db.mycol.find(
{
$or: [
{key1: value1}, {key2:value2}
]
}
).pretty()
OR in MongoDB
The following example shows all the tutorials written by 'tutorials point' or whose title is 'MongoDB Overview'.
>db.mycol.find({$or:[{"by":"tutorials point"},{"title": "MongoDB Overview"}]}).pretty()
{
	"_id": ObjectId("7df78ad8902c"),
	"title": "MongoDB Overview",
	"description": "MongoDB is no sql database",
	"by": "tutorials point",
	"url": "http://www.tutorialspoint.com",
	"tags": ["mongodb", "database", "NoSQL"],
	"likes": 100
}
>
AND and OR in MongoDB
where likes > 10 AND (by = 'tutorials point' OR title = 'MongoDB Overview')
>db.mycol.find({"likes": {$gt:10}, $or: [{"by": "tutorials point"},
{"title": "MongoDB Overview"}]}).pretty()
{
	"_id": ObjectId("7df78ad8902c"),
	"title": "MongoDB Overview",
	"description": "MongoDB is no sql database",
	"by": "tutorials point",
	"url": "http://www.tutorialspoint.com",
	"tags": ["mongodb", "database", "NoSQL"],
	"likes": 100
}
insertMany
db.collection.insertMany(
[ <document 1> , <document 2>, ... ],
{
writeConcern: <document>,
ordered: <boolean>
}
)
insertMany check null
db.inventory.insertMany([
{ _id: 1, item: null },
{ _id: 2 }
])
db.inventory.find( { item: null } )
The { item : { $type: 10 } } query matches only documents that contain the item field whose value
is null, i.e. the value of the item field is of BSON type Null (type number 10):
db.inventory.find( { item : { $type: 10 } } )
insertMany check missing field
The following matches only documents that do not contain the item field at all:
db.inventory.find( { item : { $exists: false } } )
MongoDB Update
● db.collection.updateOne(<filter>, <update>, <options>)
● db.collection.updateMany(<filter>, <update>, <options>)
● db.collection.replaceOne(<filter>, <replacement>, <options>)
db.inventory.insertMany( [
{ item: "canvas", qty: 100, size: { h: 28, w: 35.5, uom: "cm" }, status: "A" },
{ item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
{ item: "mat", qty: 85, size: { h: 27.9, w: 35.5, uom: "cm" }, status: "A" },
{ item: "mousepad", qty: 25, size: { h: 19, w: 22.85, uom: "cm" }, status: "P" },
{ item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "P" },
{ item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
{ item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" },
{ item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" },
{ item: "sketchbook", qty: 80, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
{ item: "sketch pad", qty: 95, size: { h: 22.85, w: 30.5, uom: "cm" }, status: "A" }
] );
MongoDB Update
db.inventory.updateOne(
{ item: "paper" },
{
$set: { "size.uom": "cm", status: "P" },
$currentDate: { lastModified: true }
}
)
MongoDB Collection Update methods
The following methods can also add new documents to a collection:
● db.collection.update() when used with the upsert: true option.
● db.collection.updateOne() when used with the upsert: true option.
● db.collection.updateMany() when used with the upsert: true option.
● db.collection.findAndModify() when used with the upsert: true option.
● db.collection.findOneAndUpdate() when used with the upsert: true option.
● db.collection.findOneAndReplace() when used with the upsert: true option.
● db.collection.save() (deprecated since MongoDB 4.2).
● db.collection.bulkWrite().
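As a sketch of the upsert behavior these methods share (field names follow the earlier inventory examples; requires a running mongod):

```javascript
// With upsert: true, if no document matches the filter MongoDB inserts
// a new document combining the filter fields and the $set fields.
db.inventory.updateOne(
  { item: "sketch pad" },
  { $set: { qty: 95, status: "A" } },
  { upsert: true }
)
```

Run once, this inserts the document; run again, it merely updates it.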
MongoDB Collection Update methods
db.inventory.insertMany( [
{ item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
{ item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "P" },
{ item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
{ item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" },
{ item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" },
] );
MongoDB Collection Delete methods
The following example deletes all documents where status is "A":
db.inventory.deleteMany({ status : "A" })
The following example deletes the first document where status is "D":
db.inventory.deleteOne( { status: "D" } )
Data Modelling
Data Modelling One to One
Normalized
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
Data Modelling One to One
Denormalized (Better Option)
{
_id: "joe",
name: "Joe Bookreader",
address: {
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
}
Data Modelling One to Many
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
{
patron_id: "joe",
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: "12345"
}
Data Modelling One to Many Denormalized
{
_id: "joe",
name: "Joe Bookreader",
addresses: [
{
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
},
{
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: "12345"
}
]
}
Advantage: this saves on the number of queries.
Data Modelling
The Process of Data Modeling in MongoDB
Data modeling can bring improved database performance, but it requires weighing several considerations,
which include:
● Data retrieval patterns
● Balancing needs of the application such as: queries, updates and data processing
● Performance features of the chosen database engine
● The Inherent structure of the data itself
Data Modelling
MongoDB Document Structure
Documents in MongoDB play a major role in the decision over which technique to apply for a
given set of data. There are generally two ways to represent relationships between data:
● Embedded data
● Referenced data
Data Modelling
● Embedded data: related data is stored within a single document, either as a field value or an array
within the document itself. The main advantage of this approach is that data is denormalized,
providing an opportunity to manipulate the related data in a single database operation.
Consequently, CRUD operations are faster and fewer queries are required. Let's consider an
example document below.
Data Modelling
{ "_id" : ObjectId("5b98bfe7e8b9ab9875e4c80c"),
"StudentName" : "George Beckonn",
"Settings" : {
"location" : "Embassy",
"ParentPhone" : 724765986
"bus" : "KAZ 450G",
"distance" : "4",
"placeLocation" : {
"lat" : -0.376252,
"lng" : 36.937389
}
}
}
Data Modelling
Strengths of Embedding
1. Increased data access speed: For an improved rate of access to data, embedding is the best
option since a single query operation can manipulate data within the specified document with
just a single database look-up.
2. Reduced data inconsistency: during operation, if something goes wrong (for example a network
disconnection or power failure) only a small number of documents may be affected, since the
criteria often select a single document.
3. Reduced CRUD operations: read operations will typically outnumber the writes. Besides, it is
possible to update related data in a single atomic write operation. I.e. for the above data, we can
update the phone number and also increase the distance with the single operation below.
Data Modelling
db.students.updateOne({StudentName : "George Beckonn"}, {
$set: {"ParentPhone" : 72436986},
$inc: {"Settings.distance": 1}
})
Data Modelling
Weaknesses of Embedding
1. Restricted document size: all documents in MongoDB are constrained to the BSON size limit of 16
megabytes. Therefore, the overall document size together with embedded data should not surpass
this limit. Otherwise, with some storage engines such as MMAPv1, documents may outgrow their
allocated space, causing data fragmentation and degraded write performance.
2. Data duplication: multiple copies of the same data make it harder to keep in sync and query the
replicated data, and it may take longer to filter embedded documents, undermining the core
advantage of embedding.
Data Modelling
Dot Notation
Dot notation is how embedded data is addressed in queries. It is used to access elements of an
embedded field or an array. In the sample data above, we can return information for the student
whose location is "Embassy" with this query using dot notation:
db.users.find({'Settings.location': 'Embassy'})
Data Modelling
Flexible Schema
A flexible schema in MongoDB means that documents need not have the same fields, and the data type
of a field can differ across documents within a collection. The core advantage of this concept is that one
can add new fields, remove existing ones, or change field values to a new type, and hence update the
document into a new structure.
For example we can have these 2 documents in the same collection
Data Modelling
{ "_id" : ObjectId("5b98bfe7e8b9ab9875e4c80c"),
"StudentName" : "George Beckonn",
"ParentPhone" : 75646344,
"age" : 10
}
{ "_id" : ObjectId("5b98bfe7e8b9ab98757e8b9a"),
"StudentName" : "Fredrick Wesonga",
"ParentPhone" : false,
}
Data Modelling
Example:
Let's insert the data below into the clients collection.
Data Modelling
Validation rules are applied to inserts and updates; however, validation can also be applied to already existing documents.
There are 3 levels of validation:
1. Strict: this is the default MongoDB validation level and it applies validation rules to all inserts and
updates.
2. Moderate: The validation rules are applied during inserts, updates and to already existing
documents that fulfill the validation criteria only.
3. Off: this level sets the validation rules for a given schema to null hence no validation will be done
to the documents.
Data Modelling
db.clients.insert([
{
"_id" : 1,
"name" : "Brillian",
"phone" : "+1 778 574 666",
"city" : "Beijing",
"status" : "Married"
},
{
"_id" : 2,
"name" : "James",
"city" : "Peninsula"
}
])
Data Modelling
If we apply the moderate validation level using:
db.runCommand( {
collMod: "clients",
validator: { $jsonSchema: {
bsonType: "object",
required: [ "phone", "name" ],
properties: {
phone: {
bsonType: "string",
description: "must be a string and is required"
},
name: {
bsonType: "string",
description: "must be a string and is required"
}
}
} },
validationLevel: "moderate"
} )
Data Modelling
Schema Validation Actions
After doing validation on documents, there may be some that may violate the validation rules. There is
always a need to provide an action when this happens.
MongoDB provides two actions that can be issued to the documents that fail the validation rules:
1. Error: this is the default MongoDB action, which rejects any insert or update in case it violates
the validation criteria.
2. Warn: This action will record the violation in the MongoDB log, but allows the insert or update
operation to be completed. For example:
Data Modelling
db.createCollection("students", {
validator: {$jsonSchema: {
bsonType: "object",
required: [ "name", "gpa" ],
properties: {
name: {
bsonType: "string",
description: "must be a string and is required"
},
gpa: {
bsonType: [ "double" ],
minimum: 0,
description: "must be a double and is required"
}
}
},
validationAction: "warn"
})
Data Modelling
db.students.insert( { name: "Amanda", status: "Updated" } )
The gpa field is missing even though it is required in the schema design, but since the validation
action has been set to warn, the document will be saved and a warning message will be recorded in
the MongoDB log.
Validation cannot be set on the admin, local and config databases.
Data Modelling
db.createCollection( "contacts5", {
validator: { $jsonSchema: {
bsonType: "object",
required: [ "phone" ],
properties: {
phone: {
bsonType: "string",
description: "must be a string and is required"
},
email: {
bsonType : "string",
pattern : "@mongodb.com$",
description: "must be a string and match the regular expression pattern"
},
status: {
enum: [ "Unknown", "Incomplete" ],
description: "can only be one of the enum values"
}
}
} },
validationAction: "error"
} )
Data Modelling
> db.contacts5.insert( { phone: "123" ,email: "abc@mongodb.com", status: "Unknown" } )
WriteResult({ "nInserted" : 1 })
> db.contacts5.insert( { phone: 123 ,email: "abc@mongodb.com", status: "Unknown" } )
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
>
Data Modelling Normalization
Data Modelling DeNormalization
Data Modelling DeNormalization
The advantage of this is that you need one less query to get the information. The
downside is that it takes up more space and is more difficult to keep in sync. For
example, if we decide that the light style should be renamed to day, we would have to
update every single document where user.accountsPref.style was light.
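The rename described above can be done with a single statement in the mongo shell (a sketch: the users collection and the accountsPref.style path are taken from the text and assume that document structure; requires a running mongod):

```javascript
// Rename the "light" style to "day" in every matching document.
db.users.updateMany(
  { "accountsPref.style": "light" },
  { $set: { "accountsPref.style": "day" } }
)
```

Even so, this touches every affected document, which is exactly the synchronization cost of denormalization.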
Data Modelling Join
db.order1.insert([
{ "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
{ "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 },
{ "_id" : 3 }
])
db.inventory1.insert([
{ "_id" : 1, "sku" : "almonds", description: "product 1", "instock" : 120 },
{ "_id" : 2, "sku" : "bread", description: "product 2", "instock" : 80 },
{ "_id" : 3, "sku" : "cashews", description: "product 3", "instock" : 60 },
{ "_id" : 4, "sku" : "pecans", description: "product 4", "instock" : 70 },
{ "_id" : 5, "sku": null, description: "Incomplete" },
{ "_id" : 6 }
])
Data Modelling Join
db.order1.aggregate([
{
$lookup:
{
from: "inventory",
localField: "item",
foreignField: "sku",
as: "inventory_docs"
}
}
])
Data Modelling Join
The SQL equivalent is approximately:
SELECT *, inventory_docs
FROM order1
WHERE inventory_docs IN (SELECT *
FROM inventory1
WHERE sku = order1.item);
Data Modelling Nested Pipeline
db.orders2.insert([
{ "_id" : 1, "item" : "almonds", "price" : 12, "ordered" : 2 },
{ "_id" : 2, "item" : "pecans", "price" : 20, "ordered" : 1 },
{ "_id" : 3, "item" : "cookies", "price" : 10, "ordered" : 60 }
])
Data Modelling Nested Pipeline
db.warehouses.insert([
{ "_id" : 1, "stock_item" : "almonds", warehouse: "A", "instock" : 120 },
{ "_id" : 2, "stock_item" : "pecans", warehouse: "A", "instock" : 80 },
{ "_id" : 3, "stock_item" : "almonds", warehouse: "B", "instock" : 60 },
{ "_id" : 4, "stock_item" : "cookies", warehouse: "B", "instock" : 40 },
{ "_id" : 5, "stock_item" : "cookies", warehouse: "A", "instock" : 80 }
])
Data Modelling Nested Pipeline
db.orders2.aggregate([
{ $lookup:
{
from: "warehouses",
let: { order_item: "$item", order_qty: "$ordered" },
pipeline: [
{ $match:
{ $expr:
{ $and:
[
{ $eq: [ "$stock_item", "$$order_item" ] },
{ $gte: [ "$instock", "$$order_qty" ] }
]
}
}
},
{ $project: { stock_item: 0, _id: 0 } }
],
as: "stockdata"
}
}
])
Data Modelling Nested Pipeline
The above is equivalent to the SQL statement below:
SELECT *, stockdata
FROM orders
WHERE stockdata IN (SELECT warehouse, instock
FROM warehouses
WHERE stock_item= orders.item
AND instock >= orders.ordered );
Data Modelling Tools
Hackolade, a visual data modeling tool for MongoDB
Data Modelling Tree Structures
Data Modelling Tree Structures Parent reference
db.categories.insert( { _id: "MongoDB", parent: "Databases" } )
db.categories.insert( { _id: "dbm", parent: "Databases" } )
db.categories.insert( { _id: "Databases", parent: "Programming" } )
db.categories.insert( { _id: "Languages", parent: "Programming" } )
db.categories.insert( { _id: "Programming", parent: "Books" } )
db.categories.insert( { _id: "Books", parent: null } )
db.categories.findOne( { _id: "MongoDB" } ).parent
You can create an index on the field parent to enable fast search by the parent node:
db.categories.createIndex( { parent: 1 } )
Data Modelling Tree Structures
You can query by the parent field to find its immediate children nodes:
db.categories.find( { parent: "Databases" } )
Data Modelling Tree Structures child as reference
db.categories4.insert( { _id: "MongoDB", children: [] } )
db.categories4.insert( { _id: "dbm", children: [] } )
db.categories4.insert( { _id: "Databases", children: [ "MongoDB", "dbm" ] } )
db.categories4.insert( { _id: "Languages", children: [] } )
db.categories4.insert( { _id: "Programming", children: [ "Databases", "Languages" ] } )
db.categories4.insert( { _id: "Books", children: [ "Programming" ] } )
Data Modelling Tree Structures child as reference
db.categories4.findOne( { _id: "Databases" } ).children
Create an index on children for fast search:
db.categories4.createIndex( { children: 1 } )
You can query for a node in the children field to find its parent node as well as its siblings:
db.categories4.find( { children: "MongoDB" } )
Data Modelling Tree Structures ancestors as reference
db.categories5.insert( { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ],
parent: "Databases" } )
db.categories5.insert( { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent:
"Databases" } )
db.categories5.insert( { _id: "Databases", ancestors: [ "Books", "Programming" ], parent:
"Programming" } )
db.categories5.insert( { _id: "Languages", ancestors: [ "Books", "Programming" ], parent:
"Programming" } )
db.categories5.insert( { _id: "Programming", ancestors: [ "Books" ], parent: "Books" } )
db.categories5.insert( { _id: "Books", ancestors: [ ], parent: null } )
Data Modelling Tree Structures ancestors as reference
db.categories5.findOne( { _id: "MongoDB" } ).ancestors
db.categories5.createIndex( { ancestors: 1 } )
You can query by the field ancestors to find all its descendants:
db.categories5.find( { ancestors: "Programming" } )
Data Modelling Tree Structures Materialized Path
db.categories7.insert( { _id: "Books", path: null } )
db.categories7.insert( { _id: "Programming", path: ",Books," } )
db.categories7.insert( { _id: "Databases", path: ",Books,Programming," } )
db.categories7.insert( { _id: "Languages", path: ",Books,Programming," } )
db.categories7.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } )
db.categories7.insert( { _id: "dbm", path: ",Books,Programming,Databases," } )
Data Modelling Tree Structures Materialized Path
db.categories7.find().sort( { path: 1 } )
db.categories7.find( { path: /,Programming,/ } )
db.categories7.find( { path: /^,Books,/ } )
db.categories7.createIndex( { path: 1 } )
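The materialized-path regexes work because the path string is just a comma-delimited ancestor list; a plain-JavaScript stand-in for the categories7 collection makes the matching visible:

```javascript
const categories = [
  { _id: "Books", path: null },
  { _id: "Programming", path: ",Books," },
  { _id: "Databases", path: ",Books,Programming," },
  { _id: "Languages", path: ",Books,Programming," },
  { _id: "MongoDB", path: ",Books,Programming,Databases," },
  { _id: "dbm", path: ",Books,Programming,Databases," },
];

// find( { path: /,Programming,/ } ): all descendants of Programming,
// because ",Programming," appears somewhere in their path.
const underProgramming = categories.filter(c => /,Programming,/.test(c.path || ""));

// find( { path: /^,Books,/ } ): everything below the root Books;
// the anchored regex can use the index on path.
const underBooks = categories.filter(c => /^,Books,/.test(c.path || ""));
```

Note that only the `^`-anchored regex can take advantage of the index on path; the unanchored one must scan the whole index.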
Data Modelling Tree Structures Nested Sets
Data Modelling Tree Structures Nested Sets
db.categories6.insert( { _id: "Books", parent: 0, left: 1, right: 12 } )
db.categories6.insert( { _id: "Programming", parent: "Books", left: 2, right: 11 } )
db.categories6.insert( { _id: "Languages", parent: "Programming", left: 3, right: 4 } )
db.categories6.insert( { _id: "Databases", parent: "Programming", left: 5, right: 10 } )
db.categories6.insert( { _id: "MongoDB", parent: "Databases", left: 6, right: 7 } )
db.categories6.insert( { _id: "dbm", parent: "Databases", left: 8, right: 9 } )
Query to retrieve descendant:
var databaseCategory = db.categories6.findOne( { _id: "Databases" } );
db.categories6.find( { left: { $gt: databaseCategory.left }, right: { $lt: databaseCategory.right } } );
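The nested-sets query relies on the interval property: a descendant's [left, right] range falls strictly inside its ancestor's range. An in-memory stand-in for categories6 shows this:

```javascript
const categories = [
  { _id: "Books", parent: 0, left: 1, right: 12 },
  { _id: "Programming", parent: "Books", left: 2, right: 11 },
  { _id: "Languages", parent: "Programming", left: 3, right: 4 },
  { _id: "Databases", parent: "Programming", left: 5, right: 10 },
  { _id: "MongoDB", parent: "Databases", left: 6, right: 7 },
  { _id: "dbm", parent: "Databases", left: 8, right: 9 },
];

const databases = categories.find(c => c._id === "Databases");

// Equivalent of find( { left: { $gt: ... }, right: { $lt: ... } } ):
// descendants are exactly the nodes nested inside the parent's interval.
const descendants = categories.filter(
  c => c.left > databases.left && c.right < databases.right);
```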
Data Modelling Operational Strategies
{ last_name : "Smith", best_score: 3.9 }
Change it to :
{ lname : "Smith", score : 3.9 }
And save 9 bytes
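The 9-byte figure comes from the shorter key strings: BSON stores each field name in every document, so a character trimmed from a key saves one byte per document. A quick check:

```javascript
// One byte saved per character removed from each key name, per document.
const saved = ("last_name".length - "lname".length)   // 9 - 5 = 4
            + ("best_score".length - "score".length); // 10 - 5 = 5
// saved === 9
```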
Data Modelling Operational Strategies
Data Lifecycle Management
Data modeling decisions should take data lifecycle management into consideration.
The Time to Live or TTL feature of collections expires documents after a period of time. Consider using the TTL
feature if your application requires some data to persist in the database for a limited period of time.
Additionally, if your application only uses recently inserted documents, consider Capped Collections. Capped
collections provide first-in-first-out (FIFO) management of inserted documents and efficiently support operations that
insert and read documents based on insertion order.
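A sketch of what the TTL mechanism does conceptually. In the real server a background thread uses a TTL index (e.g. `db.eventlog.createIndex( { createdAt: 1 }, { expireAfterSeconds: 60 } )`); this in-memory version is only illustrative:

```javascript
// Returns the documents a TTL index with the given expiry would keep:
// anything whose indexed date field is younger than ttlSeconds.
function notExpired(docs, ttlSeconds, now) {
  return docs.filter(d => (now - d.createdAt) / 1000 < ttlSeconds);
}

const now = Date.now();
const docs = [
  { _id: 1, createdAt: now - 5000 },    // 5 seconds old: kept
  { _id: 2, createdAt: now - 120000 },  // 2 minutes old: removed
];
const live = notExpired(docs, 60, now);
```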
MongoDB backup Strategies
Backup Strategy #1: mongodump
Backup Strategy #2: Copying the Underlying Files
For example, Linux LVM quickly and efficiently creates a consistent snapshot of the file system that can be copied for backup and restore
purposes. To ensure that the snapshot is logically consistent, you must have journaling enabled within MongoDB.
Backup Strategy #3: MongoDB Management Service (MMS)
MongoDB Management Service provides continuous, online backup for MongoDB as a fully managed service. You install the Backup Agent
in your environment, which conducts an initial sync to MongoDB’s secure and redundant datacenters. After the initial sync, MMS streams
encrypted and compressed MongoDB oplog data to MMS so that you have a continuous backup.
MongoDB Monitoring
When it comes to MongoDB monitoring, some of the important metrics to monitor are:
● Performance stats
● Utilization of resources (CPU usage, available memory and Network usage)
● Assert stats
● Replication stats
● Saturation of resources
● Throughput operations
Applications Manager MongoDB monitoring service supports all versions of MongoDB up to version 4.0.2.
MongoDB Relationship
User
{
"_id":ObjectId("52ffc33cd85242f436000001"),
"name": "Tom Hanks",
"contact": "987654321",
"dob": "01-01-1991"
}
MongoDB Relationship
Address
{
"_id":ObjectId("52ffc4a5d85242602e000000"),
"building": "22 A, Indiana Apt",
"pincode": 123456,
"city": "Los Angeles",
"state": "California"
}
MongoDB Relationship
Modelling Embedded Relationship
{
"_id":ObjectId("52ffc33cd85242f436000001"),
"contact": "987654321",
"dob": "01-01-1991",
"name": "Tom Benzamin",
"address": [
{
"building": "22 A, Indiana Apt",
"pincode": 123456,
"city": "Los Angeles",
"state": "California"
}
]
}
MongoDB Relationship
This approach maintains all the related data in a single document, which makes it easy to retrieve and maintain. The whole
document can be retrieved in a single query such as −
>db.users.findOne({"name":"Tom Benzamin"},{"address":1})
MongoDB Scalability
MongoDB Scalability
The primary node receives all write operations. A replica set can have only one primary capable of confirming writes
with { w: "majority" } write concern; although in some circumstances, another mongod instance may
transiently believe itself to also be primary. [1] The primary records all changes to its data sets in its operation log, i.e.
oplog. For more information on primary node operation, see Replica Set Primary.
The secondaries replicate the primary’s oplog and apply the operations to their data sets such that the secondaries’
data sets reflect the primary’s data set. If the primary is unavailable, an eligible secondary will hold an election to elect
itself the new primary. For more information on secondary members, see Replica Set Secondary Members.
MongoDB Scalability
MongoDB Scalability
MongoDB Scalability
An arbiter will always be an arbiter, whereas a primary may step down and become a secondary, and a secondary may
become the primary during an election.
MongoDB Scalability Failover
When a primary does not communicate with the other members of the set for more than the configured
electionTimeoutMillis period (10 seconds by default), an eligible secondary calls for an election to nominate
itself as the new primary. The cluster attempts to complete the election of a new primary and resume normal
operations.
MongoDB Scalability Failover
Read Preference
By default, clients read from the primary [1]; however, clients can specify a read preference to send read operations to
secondaries.
MongoDB Scalability Failover
MongoDB Replicaset
mkdir -p rs1 rs2 rs3
Localhost replica
mongod --replSet mongo1 --logpath "rs1.log" --dbpath rs1 --port 27017 &
mongod --replSet mongo1 --logpath "rs2.log" --dbpath rs2 --port 27018 &
mongod --replSet mongo1 --logpath "rs3.log" --dbpath rs3 --port 27019 &
config = { _id: "mongo1", members: [ { _id: 0, host: "localhost:27017" }, { _id: 1, host: "localhost:27018" }, { _id: 2, host: "localhost:27019" } ] };
rs.initiate(config)
rs.status()
MongoDB Replicaset
Make the below changes on all the nodes:
➢ /etc/mongodb.conf file
replSet = mongoRs
/etc/hosts file
Restart mongodb
$mongo --host 192.168.64.24
config = { _id: "mongoRs", members:[ {_id : 0,host : "node1:27017"}, {_id : 1,host : "node2:27017"}] };
MongoDB Replicaset
mkdir -p rs1 rs2 rs3
mongod --replSet mongo1 --logpath "rs1.log" --dbpath rs1 --port 27017 &
mongod --replSet mongo1 --logpath "rs2.log" --dbpath rs2 --port 27018 &
mongod --replSet mongo1 --logpath "rs3.log" --dbpath rs3 --port 27019 &
MongoDB Sharding
sh.addTagRange("records.users", { zipcode: "10001" }, { zipcode: "10281" }, "NYC")
sh.addTagRange("records.users", { zipcode: "11201" }, { zipcode: "11240" }, "NYC")
sh.addTagRange("records.users", { zipcode: "94102" }, { zipcode: "94135" }, "SFO")
MongoDB Sharding
Horizontal Partitioning
Storing rows across multiple databases.
MongoDB Algorithmic Sharding
MongoDB Algorithmic Sharding
Algorithmically sharded databases use a sharding function (partition_key) ->
database_id to locate data. A simple sharding function may be “hash(key) %
NUM_DB”.
MongoDB Dynamic Sharding
MongoDB Entity Groups Sharding
MongoDB Hierarchical Sharding
MongoDB Sharding
MongoDB Hierarchical Sharding
For example, if the shard key is:
{ a: 1, b: 1, c: 1 }
The mongos program can route queries that include the full shard key or either of the following shard key
prefixes to a specific shard or set of shards:
{ a: 1 }
{ a: 1, b: 1 }
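The prefix rule can be sketched as a small routing check (a hypothetical helper, not part of mongos; it assumes the shard key { a: 1, b: 1, c: 1 }):

```javascript
// Fields of the compound shard key, in index order.
const shardKey = ["a", "b", "c"];

// mongos can target specific shards only when the query constrains
// a leading prefix of the shard key: { a }, { a, b }, or { a, b, c }.
function canTargetShards(queryFields) {
  let i = 0;
  while (i < shardKey.length && queryFields.includes(shardKey[i])) i++;
  return i > 0; // at minimum the leading field "a" must be constrained
}
```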
MongoDB Shard Collections Distribution
All insertOne() operations target one shard. Each document in an insertMany() array targets a single
shard, but there is no guarantee that all documents in the array insert into a single shard.
All updateOne(), replaceOne() and deleteOne() operations must include the shard key or _id in the query
document. MongoDB returns an error if these methods are used without the shard key or _id.
Depending on the distribution of data in the cluster and the selectivity of the query, mongos may still perform a
broadcast operation to fulfill these queries.
MongoDB Components of Sharding
The components of a sharded cluster include:
1. A Shard – the basic unit: a MongoDB instance which holds a subset of the data. In production
environments, every shard should be part of a replica set.
2. Config server – a mongod instance which holds metadata about the cluster, basically
information about the various mongod instances which hold the shard data.
3. A Router (mongos) – an instance which is responsible for redirecting the
commands sent by the client to the right servers.
MongoDB Components of Sharding & Setup
Setup Requirements
We require the following servers for Mongodb Sharding setup:
Query server – Server A
Config server & Shards / Replica Set – Server B
Config server & Shards / Replica Set – Server C
Config server & Shards / Replica Set – Server D
MongoDB Components of Sharding
Run in All servers
$ apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
$ echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.2 multiverse" | sudo tee
/etc/apt/sources.list.d/mongodb-org-3.2.list
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org
Skip the above steps if MongoDB is already installed and apt is set up.
MongoDB Query Router Setup
$ echo "mongodb-org hold" | sudo dpkg --set-selections
$ echo "mongodb-org-server hold" | sudo dpkg --set-selections
$ echo "mongodb-org-shell hold" | sudo dpkg --set-selections
$ echo "mongodb-org-mongos hold" | sudo dpkg --set-selections
$ echo "mongodb-org-tools hold" | sudo dpkg --set-selections
MongoDB Sharding Config Server Setup Run in Servers B, C and D
$ mkdir -p /var/mongodb/mongo-metadata
$ mongod --configsvr --dbpath /var/mongodb/mongo-metadata --port 27019
Enabling Sharding
Mongo Shards Replica set setup in Servers B, C and D
$ mkdir -p /var/mongodb/store/rs1
$ mkdir -p /var/mongodb/store/rs2
$ mkdir -p /var/mongodb/store/rs3
Enabling Sharding
Run mongoDB shard replicaset
$ mongod --shardsvr --replSet rs1 --dbpath /var/mongodb/store/rs1 --port 27000
$ mongod --shardsvr --replSet rs2 --dbpath /var/mongodb/store/rs2 --port 27001
$ mongod --shardsvr --replSet rs3 --dbpath /var/mongodb/store/rs3 --port 27002
Sharding Steps
In Mongo Console
$mongo --port 27000
mongo> rs.initiate()
mongo> rs.add("shard.host.com:27001")
mongo> rs.add("shard.host.com:27002")
MongoDb Sharding Query Router setup
MongoDB Sharding Query Routers setup in Server A
$sudo service mongodb stop
In order to start the query router service:
# Using this same (following) command you can start multiple query servers
$ mongos --configdb config0.host.com:27019,config1.host.com:27019,config2.host.com:27019
Add Shard Server in Query Router
In Server A
$ mongo --host query0.example.com --port 27017
mongo> sh.addShard( "rep_set_name/rep_set_member:27017" )
$ mongo --port 27000
mongo> rs.status()
Enabling Sharding for a Database and Collection
$ mongo --host query0.example.com --port 27017
mongo> sh.stopBalancer()
#Enabling Sharding to the Database
mongo> use current_DB
mongo> sh.enableSharding("current_DB")
Enabling Sharding to the Collection
mongo> use current_DB
mongo> sh.shardCollection( "current_DB.test_collection", { "<shard key field>" : 1 } )
MongoDB Indexes
What Problem do Indexes solve?
Slow Queries
Without indexes, MongoDB must perform a collection scan, i.e. scan every
document in a collection, to select those documents that match the query
statement. If an appropriate index exists for a query, MongoDB can use the
index to limit the number of documents it must inspect.
MongoDB Indexes
MongoDB Indexes
db.collection.createIndex( <key and index type specification>, <options> )
db.collection.createIndex( { name: -1 } )
You can create indexes with a custom name, such as one that is more human-readable than the default. For example, consider
an application that frequently queries the products collection to populate data on existing inventory. The following
createIndex() method creates an index on item and quantity named "query for inventory":
db.products.createIndex(
{ item: 1, quantity: -1 } ,
{ name: "query for inventory" }
)
MongoDB Indexes Types
Single Field
In addition to the MongoDB-defined _id index, MongoDB supports the creation of user-defined
ascending/descending indexes on a single field of a document.
MongoDB Indexes Types
Compound Index
MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes.
The order of fields listed in a compound index has significance. For instance, if a compound index consists of
{userid: 1, score: -1 }, the index sorts first by userid and then, within each userid value, sorts by score.
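The ordering a { userid: 1, score: -1 } compound index maintains can be reproduced with a plain comparator (an illustrative stand-in, not how the server stores B-tree entries):

```javascript
const docs = [
  { userid: "bb", score: 10 },
  { userid: "aa", score: 5 },
  { userid: "aa", score: 9 },
  { userid: "bb", score: 3 },
];

// Sort by userid ascending, then score descending within each userid,
// mirroring the { userid: 1, score: -1 } index order.
docs.sort((x, y) =>
  x.userid < y.userid ? -1 :
  x.userid > y.userid ? 1 :
  y.score - x.score);
```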
MongoDB Indexes Types
Multikey Index
MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value,
MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to
select documents that contain arrays by matching on element or elements of the arrays. MongoDB automatically
determines whether to create a multikey index if the indexed field contains an array value; you do not need to
explicitly specify the multikey type.
MongoDB Indexes Types
MongoDB Indexes Types
Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes that
use planar geometry when returning results and 2dsphere indexes that use spherical geometry to return results.
See 2d Index Internals for a high level introduction to geospatial indexes.
MongoDB Indexes Types
Text Indexes
MongoDB provides a text index type that supports searching for string content in a collection. These text indexes do
not store language-specific stop words (e.g. “the”, “a”, “or”) and stem the words in a collection to only store root
words.
MongoDB Indexes Types
Hashed Indexes
To support hash based sharding, MongoDB provides a hashed index type, which indexes the hash of the value of a
field. These indexes have a more random distribution of values along their range, but only support equality matches
and cannot support range-based queries.
MongoDB Indexes Properties
Unique Indexes
The unique property for an index causes MongoDB to reject duplicate values for the indexed field. Other than the
unique constraint, unique indexes are functionally interchangeable with other MongoDB indexes.
Partial Indexes
New in version 3.2.
Partial indexes only index the documents in a collection that meet a specified filter expression. By indexing a subset
of the documents in a collection, partial indexes have lower storage requirements and reduced performance costs for
index creation and maintenance.
Partial indexes offer a superset of the functionality of sparse indexes and should be preferred over sparse indexes
MongoDB Indexes Properties
Sparse Indexes
The sparse property of an index ensures that the index only contains entries for documents that have the indexed field. The index
skips documents that do not have the indexed field.
You can combine the sparse index option with the unique index option to prevent inserting documents that have duplicate values
for the indexed field(s) and skip indexing documents that lack the indexed field(s).
TTL Indexes
TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a
certain amount of time. This is ideal for certain types of information like machine generated event data, logs, and
session information that only need to persist in a database for a finite amount of time.
MongoDB Indexes Uses
Indexes can improve the efficiency of read operations. The Analyze Query Performance tutorial provides an example
of the execution statistics of a query with and without an index.
For information on how MongoDB chooses an index to use, see query optimizer.
MongoDB Indexes & Collation
To use an index for string comparisons, an operation must also specify the same collation. That is, an index with a
collation cannot support an operation that performs string comparisons on the indexed fields if the operation specifies
a different collation.
For example, the collection myColl has an index on a string field category with the collation locale "fr".
db.myColl.createIndex( { category: 1 }, { collation: { locale: "fr" } } )
// This query specifies the same "fr" collation and can use the index:
db.myColl.find( { category: "cafe" } ).collation( { locale: "fr" } )
// This query uses the default "simple" collation and cannot use the index:
db.myColl.find( { category: "cafe" } )
MongoDB Indexes & Collation
For example, the collection myColl has a compound index on the numeric fields score and price and the string
field category; the index is created with the collation locale "fr" for string comparisons:
db.myColl.createIndex(
{ score: 1, price: 1, category: 1 },
{ collation: { locale: "fr" } } )
The following operations, which use "simple" binary collation for string comparisons, can use the index:
db.myColl.find( { score: 5 } ).sort( { price: 1 } )
db.myColl.find( { score: 5, price: { $gt: NumberDecimal( "10" ) } } ).sort( { price: 1 } )
MongoDB Indexes & Collation
The following operation, which uses "simple" binary collation for string comparisons on the indexed category field,
can use the index to fulfill only the score: 5 portion of the query:
db.myColl.find( { score: 5, category: "cafe" } )
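The difference between "simple" binary comparison and a locale collation can be seen directly in JavaScript, whose Intl.Collator implements the same kind of locale-aware ordering. Binary ordering compares code points, so "é" sorts after "z"; a French collator places it next to "e":

```javascript
const fr = new Intl.Collator("fr");

// Default sort uses code-point order ("simple" binary collation).
const binaryFirst = ["zebra", "éclair"].sort()[0];

// Locale-aware sort groups accented letters with their base letter.
const frFirst = ["zebra", "éclair"].sort(fr.compare)[0];
```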
MongoDB Covered Queries
MongoDB Operations that support collation
Operations that Support Collation
All reading, updating, and deleting methods support collation. Some examples are listed below.
find() and sort()
Individual queries can specify a collation to use when matching and sorting results. The following query and sort
operation uses a German collation with the locale parameter set to de.
MongoDB Aggregation Types
● Pipeline
● MapReduce
● Single Purpose
MongoDB Aggregation MapReduce
MongoDB Aggregation Pipeline
● Pipeline
● Pipeline Expressions
● Aggregation Pipeline Behavior
● Considerations
MongoDB Aggregation Pipeline
db.orders.insertMany([
{ item: "journal", amount: 25, size: { h: 14, w: 21, uom: "cm" },
status: "A" },
{ item: "notebook", amount: 50, size: { h: 8.5, w: 11, uom: "in" },
status: "A" },
{ item: "paper", amount: 100, size: { h: 8.5, w: 11, uom: "in" },
status: "D" },
{ item: "planner", amount: 75, size: { h: 22.85, w: 30, uom: "cm" },
status: "D" },
{ item: "postcard", amount: 45, size: { h: 10, w: 15.25, uom: "cm" },
status: "A" }
]);
MongoDB Aggregation Pipeline
db.orders.aggregate([
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
])
*This works like Unix pipes.
First Stage: The $match stage filters the documents by the status field and passes to the next stage those
documents that have status equal to "A".
Second Stage: The $group stage groups the documents by the cust_id field to calculate the sum of the amount
for each unique cust_id.
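The two stages behave like a filter followed by a grouped sum. A plain-JavaScript sketch (the documents here are hypothetical, since the insertMany sample above carries no cust_id field):

```javascript
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// First stage: { $match: { status: "A" } }
const matched = orders.filter(o => o.status === "A");

// Second stage: { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
const totals = {};
for (const o of matched) {
  totals[o.cust_id] = (totals[o.cust_id] || 0) + o.amount;
}
```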
MongoDB Aggregation Pipeline
The MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through
the pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some
stages may generate new documents or filter out documents.
Pipeline stages can appear multiple times in the pipeline with the exception of $out, $merge, and $geoNear stages.
For a list of all available stages, see Aggregation Pipeline Stages.
MongoDB provides the db.collection.aggregate() method in the mongo shell and the aggregate command
to run the aggregation pipeline.
For example usage of the aggregation pipeline, consider Aggregation with User Preference Data and Aggregation
with the Zip Code Data Set.
MongoDB Aggregation Single Purpose
Note that insert() called with several bare documents (not an array) only inserts the first one; to insert all of them, use insertMany() with an array:
> db.ord.insertMany( [ { cust_id : "A123", amount : 500, status : "A" }, { cust_id : "A123", amount : 250, status : "A" }, { cust_id : "B212", amount : 200, status : "A" }, { cust_id : "A123", amount : 300, status : "D" } ] )
> db.ord.distinct("cust_id")
[ "A123", "B212" ]
MongoDB Performance
● Locking Performance
● Number of Connections
● Database Profiling
● Full Time Diagnostic Data Capture
MongoDB Performance
Number of Connections
In some cases, the number of connections between the applications and the database can overwhelm the ability of
the server to handle requests. The following fields in the serverStatus document can provide insight:
● connections is a container for the following two fields:
○ connections.current the total number of current clients connected to the database instance.
○ connections.available the total number of unused connections available for new clients.
If there are numerous concurrent application requests, the database may have trouble keeping up with demand. If this
is the case, then you will need to increase the capacity of your deployment.
MongoDB Performance
Database Profiling
The Database Profiler collects detailed information about operations run against a mongod instance. The profiler’s
output can help to identify inefficient queries and operations.
You can enable and configure profiling for individual databases or for all databases on a mongod instance. Profiler
settings affect only a single mongod instance and will not propagate across a replica set or sharded cluster.
See Database Profiler for information on enabling and configuring the profiler.
MongoDB Performance
The following profiling levels are available:
0 – The profiler is off and does not collect any data. This is the default profiler level.
1 – The profiler collects data for operations that take longer than the value of slowms.
2 – The profiler collects data for all operations.
MongoDB Performance - DB profiling
Enable and Configure Database Profiling
This section uses the mongo shell db.setProfilingLevel() helper to enable profiling. For instructions using a
driver, see your driver documentation.
When you enable profiling for a mongod instance, you set the profiling level to a value greater than 0. The profiler records data
in the system.profile collection. MongoDB creates the system.profile collection in a database after you enable profiling
for that database.
To enable profiling and set the profiling level, pass the profiling level to the db.setProfilingLevel() helper. For example,
to enable profiling for all database operations, consider the following operation in the mongo shell:
MongoDB Performance - DB profiling
db.setProfilingLevel(2)
{ "was" : 0, "slowms" : 100, "sampleRate" : 1.0, "ok" : 1 }
MongoDB Performance - DB profiling
Specify the Threshold for Slow Operations
By default, the slow operation threshold is 100 milliseconds. To change the slow operation threshold, specify the desired
threshold value in one of the following ways:
● Set the value of slowms using the profile command or db.setProfilingLevel() shell helper method.
● Set the value of --slowms from the command line at startup.
● Set the value of slowOpThresholdMs in a configuration file.
For example, the following code sets the profiling level for the current mongod instance to 1 and sets the slow operation
threshold for the mongod instance to 20 milliseconds:
db.setProfilingLevel(1, { slowms: 20 })
Profiling level of 1 will profile operations slower than the threshold.
MongoDB Performance - DB profiling
IMPORTANT
The slow operation threshold applies to all databases in a mongod instance. It is used by both the database
profiler and the diagnostic log and should be set to the highest useful value to avoid performance degradation.
MongoDB Performance - DB profiling
Check Profiling Level
To view the profiling level, issue the following from the mongo shell:
db.getProfilingStatus()
To enable profiling for a mongod instance, pass the following options to mongod at startup.
$mongod --profile 1 --slowms 15 --slowOpSampleRate 0.5
Example Data profiler Queries
This section displays example queries to the system.profile collection. For an explanation of the
query output, see Database Profiler Output.
To return the most recent 10 log entries in the system.profile collection, run a query similar to the
following:
>db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()
>db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
Example Data profiler Queries
To return operations for a particular collection, run a query similar to the following. This example
returns operations in the mydb database's test collection:
>db.system.profile.find( { ns : 'mydb.test' } ).pretty()
To return operations slower than 5 milliseconds, run a query similar to the following:
>db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
Example Data profiler Queries based on time
To return operations for a particular collection, run a query similar to the following. This example
returns operations in the mydb database’s test collection.
>db.system.profile.find( { ns : 'mydb.test' } ).pretty()
To return operations slower than 5 milliseconds, run a query similar to the following:
>db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
Example Data profiler Queries based on time
To return information from a certain time range, run a query similar to the following:
db.system.profile.find({
ts : {
$gt: new ISODate("2019-11-09T03:00:00Z"),
$lt: new ISODate("2019-11-09T03:40:00Z")
}
}).pretty()
Example Data create new profile
For example, to create a new system.profile collection that is 4000000 bytes, use the following sequence of operations in
the mongo shell:
db.setProfilingLevel(0)
db.system.profile.drop()
db.createCollection( "system.profile", { capped: true, size:4000000 } )
db.setProfilingLevel(1)
MongoDB security
Authentication
Authentication
SCRAM
x.509
MongoDB security
Role Based Security
Role-Based Access Control
Enable Access Control
Manage Users and Roles
MongoDB security
TLS/SSL (Transport Encryption)
Configure mongod and mongos for TLS/SSL
TLS/SSL Configuration for Clients
MongoDB security
Encrypt Communication
Configure MongoDB to use TLS/SSL for all incoming and outgoing connections. Use TLS/SSL to encrypt
communication between mongod and mongos components of a MongoDB deployment as well as between all
applications and MongoDB.
Starting in version 4.0, MongoDB uses the native TLS/SSL OS libraries.
MongoDB security
Start MongoDB without access control.
mongod --port 27017 --dbpath /var/lib/mongodb
Connect to Instance
mongo --port 27017
MongoDB security
use admin
db.createUser(
{
user: "myUserAdmin",
pwd: passwordPrompt(), // or cleartext password
roles: [ { role: "userAdminAnyDatabase", db: "admin" }, "readWriteAnyDatabase" ]
}
)
MongoDB security
Re-start the MongoDB instance with access control.
db.adminCommand( { shutdown: 1 } )
From the terminal, re-start the mongod instance with the --auth command line option or, if using a configuration
file, the security.authorization setting.
mongod --auth --port 27017 --dbpath /var/lib/mongodb
MongoDB security
Start a mongo shell with the -u <username>, -p, and the --authenticationDatabase <database>
command line options:
$mongo --host 192.168.1.103 --port 27017 --authenticationDatabase "admin" -u "myUserAdmin" -p "qwerty"
MongoDB security
use test
db.createUser(
{
user: "myTester",
pwd: passwordPrompt(), // or cleartext password
roles: [ { role: "readWrite", db: "test" },
{ role: "read", db: "reporting" } ]
}
)
MongoDB security
mongo --host 192.168.1.103 --port 27017 -u "myTester" --authenticationDatabase "test" -p "asdfg"
db.foo.insert( { x: 1, y: 1 } )
MongoDB Authentication
● Authentication Methods
● Authentication Mechanisms
● Internal Authentication
● Authentication on Sharded Clusters
MongoDB Authentication
use reporting
db.createUser(
  {
    user: "reportsUser",
    pwd: passwordPrompt(), // or cleartext password
    roles: [
      { role: "read", db: "reporting" },
      { role: "read", db: "products" },
      { role: "read", db: "sales" },
      { role: "readWrite", db: "accounts" }
    ]
  }
)
MongoDB LDAP Authentication
Users that will authenticate to MongoDB using an external authentication mechanism, such as LDAP, must be created
in the $external database, which allows mongos or mongod to consult an external source for authentication.
Changed in version 3.6.3: To use sessions with $external authentication users (i.e. Kerberos, LDAP, x.509 users),
the usernames cannot be greater than 10k bytes.
For LDAP authentication, you must specify a username. You do not need to specify the password, as that is handled by
the LDAP service.
The following operation adds the reporting user with read-only access to the records database.
Dockers & kubernetes detailed - Beginners to Geek
 
IoTCourse.pptx
IoTCourse.pptxIoTCourse.pptx
IoTCourse.pptx
 

Último

Último (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 

MongoDB

  • 2. MongoDB Overview MongoDB is a cross-platform, document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Server Side Public License. (Diagram: a relational EMPLOYEE table with NAME, ID and COMPANY columns.)
  • 3. NoSQL DB Overview Unlike MySQL and Oracle databases, NoSQL databases deal in files and documents rather than tables. NoSQL, which stands for "not only SQL," is an alternative to traditional relational databases, in which data is placed in tables and the data schema is carefully designed before the database is built. NoSQL databases are especially useful for working with large sets of distributed data.
  • 4. JSON Introduction JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write, and easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999.
  • 6. JSON Data types At the granular level, JSON consists of six data types. The first four (string, number, boolean and null) can be referred to as simple data types; the other two (object and array) are complex data types. 1. string 2. number 3. boolean 4. null/empty 5. object 6. array
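All six JSON data types can be exercised with a short, stdlib-only Python sketch (the sample document below is invented for illustration):

```python
import json

# A sample document exercising all six JSON data types.
doc = {
    "title": "MongoDB Overview",   # string
    "likes": 100,                  # number
    "published": True,             # boolean
    "editor": None,                # null
    "author": {"name": "wiTTy"},   # object
    "tags": ["mongodb", "NoSQL"],  # array
}

text = json.dumps(doc)            # generate: Python dict -> JSON string
round_tripped = json.loads(text)  # parse: JSON string -> Python dict
print(round_tripped == doc)       # the round trip preserves every type
```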
  • 8. MongoDB Overview Document Database A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.
  • 9. MongoDB Collections/Views MongoDB stores documents in collections. Collections are analogous to tables in relational databases. In addition to collections, MongoDB supports: ● Read-only Views (Starting in MongoDB 3.4) ● On-Demand Materialized Views (Starting in MongoDB 4.2)
  • 10. MongoDB Overview MongoDB is a free and open-source database. It is a NoSQL database that uses JSON-like documents. It is a cross-platform database and very easy to deploy on the cloud and on servers. It is one of the most important databases you can work with these days. MongoDB makes working with data simple. It prioritizes performance and efficiency. It is very popular, and MongoDB developers are in high demand.
  • 11. MongoDB CRUD ● How to perform CRUD (Create, Read, Update, Delete) operations on MongoDB databases
  • 12. MongoDB Create ● db.collection.insertOne() New in version 3.2 ● db.collection.insertMany() New in version 3.2
  • 13. MongoDB Read db.collection.find() You can specify query filters or criteria that identify the documents to return.
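Since find() simply matches documents against a filter document, its equality behavior can be mimicked with a tiny stdlib-only Python sketch (the collection data here is invented, and the function is a toy stand-in, not the real driver):

```python
def find(collection, query=None):
    """Return documents whose fields equal every field in the query
    (a toy stand-in for db.collection.find())."""
    query = query or {}
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in query.items())]

inventory = [
    {"item": "journal", "status": "A"},
    {"item": "paper", "status": "D"},
]

print(find(inventory))                   # no filter: every document
print(find(inventory, {"status": "D"}))  # only matching documents
```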
  • 14. MongoDB Update ● db.collection.updateOne() New in version 3.2 ● db.collection.updateMany() New in version 3.2 ● db.collection.replaceOne() New in version 3.2
  • 15. MongoDB Delete Delete operations remove documents from a collection. MongoDB provides the following methods to delete documents of a collection: ● db.collection.deleteOne() New in version 3.2 ● db.collection.deleteMany() New in version 3.2
  • 16. MongoDB Objectives ● How to perform CRUD (Create, Read, Update, Delete) operations on MongoDB databases ● How to filter for data efficiently ● How to work with both the Mongo Shell and drivers (e.g. Node.js driver) ● How to increase performance by using indexes (and how to use the right indexes!) ● How to use the amazing “Aggregation Framework” that’s built into MongoDB ● What replica sets and sharding are ● How to use MongoDB Atlas — the cloud solution offered by MongoDB ● How to use the serverless platform (Stitch) offered by MongoDB
  • 17. MongoDB Advantages ● Schemaless − MongoDB is a document database in which one collection holds different documents. The number of fields, content and size of the documents can differ from one document to another. ● The structure of a single object is clear. ● No complex joins. ● Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language that's nearly as powerful as SQL. ● Tuning. ● Ease of scale-out − MongoDB is easy to scale. ● Conversion/mapping of application objects to database objects is not needed. ● Uses internal memory for storing the (windowed) working set, enabling faster access of data. ● Scalability
  • 18. MongoDB Objectives ● Download and install MongoDB ● Modify environment variables ● Start and stop mongoDB server ● Connect to MongoDB using python ● Perform CRUD operations
  • 19. MongoDB Objectives ● Create a database ● Create a collection ● Insert documents ● Combine collections ● Import data into mongoDB ● Backup and restore mongoDB ● Create indexes
  • 20. Why Use MongoDB? ● Document Oriented Storage − Data is stored in the form of JSON style documents. ● Index on any attribute ● Replication and high availability ● Auto-sharding ● Rich queries ● Fast in-place updates ● Professional support by MongoDB
  • 21. Where to use MongoDB? ● Big Data ● Content Management and Delivery ● Mobile and Social Infrastructure ● User Data Management ● Data Hub
  • 23. MongoDB Data Modelling Suppose a client needs a database design for a blog/website; let's see the differences between RDBMS and MongoDB schema design. The website has the following requirements. ● Every post has a unique title, description and url. ● Every post can have one or more tags. ● Every post has the name of its publisher and a total number of likes. ● Every post has comments given by users along with their name, message, date-time and likes. ● On each post, there can be zero or more comments.
  • 25. MongoDB Structure { _id: POST_ID title: TITLE_OF_POST, description: POST_DESCRIPTION, by: POST_BY, url: URL_OF_POST, tags: [TAG1, TAG2, TAG3], likes: TOTAL_LIKES, comments: [ { user:'COMMENT_BY', message: TEXT, dateCreated: DATE_TIME, like: LIKES }, { user:'COMMENT_BY', message: TEXT, dateCreated: DATE_TIME, like: LIKES } ] }
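Rendered as a concrete value, the post structure above is just a nested document; this Python sketch (sample values invented) shows how the embedded comments array is reached in one lookup:

```python
post = {
    "_id": 1,
    "title": "MongoDB Overview",
    "tags": ["mongodb", "database", "NoSQL"],
    "likes": 100,
    "comments": [
        {"user": "user1", "message": "My first comment", "like": 0},
        {"user": "user2", "message": "Nice post", "like": 3},
    ],
}

# Embedded data: comments live inside the post document itself,
# so one lookup returns the post and all of its comments.
first_commenter = post["comments"][0]["user"]
total_comment_likes = sum(c["like"] for c in post["comments"])
print(first_commenter, total_comment_likes)
```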
  • 26. MongoDB Installation $sudo apt-get update $sudo apt install -y mongodb $sudo systemctl status mongodb $mongo --eval 'db.runCommand({ connectionStatus: 1 })' $mongo >show databases >use mydb >show collections
  • 27. MongoDB Conf file and Settings /etc/mongodb.conf A given Mongo database is broken up into a series of BSON files on disk, each file growing in size up to 2GB. BSON is MongoDB's own binary format, built specifically for MongoDB.
  • 28. MongoDB login $ mongo Will try to connect to localhost Using ip address and port $ mongo --host 192.168.64.24:27017 Or $mongo --host 192.168.64.24 --port 27017
  • 29. MongoDB User Creation 1. Set up your user use cool_db db.createUser({ user: 'john', pwd: 'secretPassword', roles: [{ role: 'readWrite', db:'test'}] })
  • 30. MongoDB User Creation mongo -u john -p secretPassword 192.168.64.24/test MongoDB shell version v3.6.3 connecting to: mongodb://192.168.64.24:27017/test MongoDB server version: 3.6.3 > db.inv.insert({"name":"abc1", "add":"street3"}) WriteResult({ "nInserted" : 1 })
  • 31. MongoDB User Creation 1. Set up your user for read only use cool_db db.createUser({ user: 'Sya', pwd: 'secretPassword', roles: [{ role: 'read', db:'test'}] })
  • 32. MongoDB User Creation $ mongo -u Sya -p secretPassword 192.168.64.24/test MongoDB shell version v3.6.3 connecting to: mongodb://192.168.64.24:27017/test MongoDB server version: 3.6.3 > db.inv.insert({"name":"abc1", "add":"street3"}) WriteResult({ "writeError" : { "code" : 13, "errmsg" : "not authorized on test to execute command { insert: "inv", ordered: true, $db: "test" }" } })
  • 33. MongoDB User login $ mongo -u john -p secretPassword 192.168.64.24/test
  • 35. MongoDB security authorization Create a user before uncommenting auth=true in mongodb.conf db.createUser( ... { ... user: "myUserAdmin", ... pwd: "abc123", ... roles: [ { role: "userAdminAnyDatabase", db: "admin" } ] ... } ... ) Then uncomment auth=true, restart MongoDB, and reconnect: mongo --host 192.168.64.24 --port 27017 -u myUserAdmin -p --authenticationDatabase admin
  • 36. MongoDB Create DB MongoDB's use DATABASE_NAME is used to create a database. The command will create a new database if it doesn't exist; otherwise it will switch to the existing database. If you didn't create any database, then collections will be stored in the test database. >use mydb switched to db mydb >db.dropDatabase()
  • 37. MongoDB Create Create a collection explicitly: db.createCollection("mycol", { capped : true, autoIndexId : true, size : 6142800, max : 10000 } )
  • 39. MongoDB Create Collection The cool thing about MongoDB is that you do not need to create a collection before you insert a document into it. With a single command you can insert a document into the collection, and MongoDB creates that collection on the fly. Syntax: db.collection_name.insert({key:value, key:value…})
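The "collection created on the fly" behavior can be sketched with a defaultdict (a toy in-memory model for illustration, not the real server):

```python
from collections import defaultdict

# Each database maps collection names to lists of documents;
# referencing a missing collection creates it on first insert.
database = defaultdict(list)

def insert(collection_name, document):
    database[collection_name].append(document)

insert("beginnersbook", {"name": "Chaitanya", "age": 30})
print(list(database))                       # the collection now exists
print(database["beginnersbook"][0]["name"])
```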
  • 40. MongoDB Create Collection db.beginnersbook.insert({ name: "Chaitanya", age: 30, website: "beginnersbook.com" })
  • 41. MongoDB Insert db.post.insert([ { title: 'MongoDB Overview', description: 'MongoDB is no sql database', by: 'tutorials point', url: 'http://www.tutorialspoint.com', tags: ['mongodb', 'database', 'NoSQL'], likes: 100 }, { title: 'NoSQL Database', description: "NoSQL database doesn't have tables", by: 'tutorials point', url: 'http://www.tutorialspoint.com', tags: ['mongodb', 'database', 'NoSQL'], likes: 20, comments: [ { user:'user1', message: 'My first comment', dateCreated: new Date(2013,11,10,2,35), like: 0 } ] } ])
  • 42. MongoDB Collection insert methods db.collection.insertOne() Inserts a single document into a collection. db.collection.insertMany() db.collection.insertMany() inserts multiple documents into a collection db.collection.insert() db.collection.insert() inserts a single document or multiple documents into a collection
  • 43. MongoDB Collection insert The insert() Method db.COLLECTION_NAME.insert(document) Example: db.mycol.insert({ _id: ObjectId(7df78ad8902c), title: 'MongoDB Overview', description: 'MongoDB is no sql database', by: 'tutorials point', url: 'http://www.tutorialspoint.com', tags: ['mongodb', 'database', 'NoSQL'], likes: 100 })
  • 44. MongoDB Collection insert multiple We can pass array of document also : db.post.insert([ { title: 'MongoDB Overview', description: 'MongoDB is no sql database', by: 'tutorials point', url: 'http://www.tutorialspoint.com', tags: ['mongodb', 'database', 'NoSQL'], likes: 100 }, { title: 'NoSQL Database', description: "NoSQL database doesn't have tables", by: 'tutorials point', url: 'http://www.tutorialspoint.com', tags: ['mongodb', 'database', 'NoSQL'], likes: 20, comments: [ { user:'user1', message: 'My first comment', dateCreated: new Date(2013,11,10,2,35), like: 0 } ]
  • 45. MongoDB Collection find method db.COLLECTION_NAME.find() The find() method will display all the documents in a non-structured way. The pretty() Method To display the results in a formatted way, you can use the pretty() method. Syntax >db.mycol.find().pretty()
  • 46. MongoDB Collection InsertOne db.products.insertOne( { item: "card", qty: 15 } ); The products collection will be created automatically.
  • 47. MongoDB Collection find method >db.mycol.find().pretty() { "_id": ObjectId(7df78ad8902c), "title": "MongoDB Overview" , "description": "MongoDB is no sql database" , "by": "tutorials point" , "url": "http://www.tutorialspoint.com" , "tags": ["mongodb", "database", "NoSQL"], "likes": 100 } >
  • 48. MongoDB Collection insert data db.inventory.insertMany([ { item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" }, { item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "A" }, { item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" }, { item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" }, { item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" } ]);
  • 49. MongoDB Collection select data db.inventory.find( {} ) This operation corresponds to the following SQL statement: SELECT * FROM inventory db.inventory.find( { status : "D" } ) This operation corresponds to the following SQL statement: SELECT * FROM inventory WHERE status = "D"
  • 50. MongoDB Collection select data The following example retrieves all documents from the inventory collection where status equals either "A" or "D": db.inventory.find( { status: { $in: [ "A", "D" ] } } ) SELECT * FROM inventory WHERE status IN ("A", "D")
  • 51. MongoDB Collection select data The following example retrieves all documents from the inventory collection where status equals neither "A" nor "D": db.inventory.find( { status: { $nin: [ "A", "D" ] } } ) SELECT * FROM inventory WHERE status NOT IN ("A", "D")
  • 52. MongoDB Collection select data // count the number of resultant documents db.inventory.count( { status: { $nin: [ "A", "D" ] } } ) db.inventory.find( { status: "A", qty: { $lt: 30 } } ) The operation corresponds to the following SQL statement: SELECT * FROM inventory WHERE status = "A" AND qty < 30
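The comparison operators above ($lt, $in, $nin) can be emulated in a short, stdlib-only Python sketch (a toy matcher over invented data, not the real query engine):

```python
def matches(doc, query):
    """Evaluate a small subset of MongoDB query operators
    ($lt, $in, $nin) plus plain equality against one document."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            if "$lt" in cond and not (value is not None and value < cond["$lt"]):
                return False
            if "$in" in cond and value not in cond["$in"]:
                return False
            if "$nin" in cond and value in cond["$nin"]:
                return False
        elif value != cond:
            return False
    return True

inventory = [
    {"item": "journal", "qty": 25, "status": "A"},
    {"item": "paper", "qty": 100, "status": "D"},
    {"item": "mat", "qty": 85, "status": "P"},
]

# status = "A" AND qty < 30
print([d["item"] for d in inventory
       if matches(d, {"status": "A", "qty": {"$lt": 30}})])
# status NOT IN ("A", "D")
print([d["item"] for d in inventory
       if matches(d, {"status": {"$nin": ["A", "D"]}})])
```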
  • 53. OR in MongoDB Using the $or operator, you can specify a compound query that joins each clause with a logical OR conjunction so that the query selects the documents in the collection that match at least one condition. db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } ) SELECT * FROM inventory WHERE status = "A" OR qty < 30 db.inventory.find( { $and: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
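The $or semantics, keep a document if at least one clause matches, can be sketched in stdlib-only Python (a toy supporting equality and $lt only, over invented data):

```python
def or_query(collection, clauses):
    """db.coll.find({$or: [...]}) sketch: keep documents matching
    at least one clause (toy, equality and $lt only)."""
    def clause_matches(doc, clause):
        for field, cond in clause.items():
            value = doc.get(field)
            if isinstance(cond, dict) and "$lt" in cond:
                if not (value is not None and value < cond["$lt"]):
                    return False
            elif value != cond:
                return False
        return True
    return [d for d in collection
            if any(clause_matches(d, c) for c in clauses)]

inventory = [
    {"item": "journal", "qty": 25, "status": "A"},
    {"item": "paper", "qty": 100, "status": "D"},
    {"item": "planner", "qty": 75, "status": "D"},
]

# status = "A" OR qty < 30
print([d["item"] for d in or_query(inventory,
      [{"status": "A"}, {"qty": {"$lt": 30}}])])
```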
  • 54. Exporting mongodb collection in json file $ sudo mongoexport --host 192.168.64.24 --db config --collection inventory --out /home/parallels/status.json
  • 55. OR in MongoDB To query documents based on an OR condition, you need to use the $or keyword. Following is the basic syntax of OR − >db.mycol.find( { $or: [ {key1: value1}, {key2: value2} ] } ).pretty() Using the $or operator, you can specify a compound query that joins each clause with a logical OR conjunction so that the query selects the documents in the collection that match at least one condition.
  • 56. OR in MongoDB The following example will show all the tutorials written by 'tutorials point' or whose title is 'MongoDB Overview'. >db.mycol.find({$or:[{"by":"tutorials point"},{"title": "MongoDB Overview"}]}).pretty() { "_id": ObjectId(7df78ad8902c), "title": "MongoDB Overview", "description": "MongoDB is no sql database", "by": "tutorials point", "url": "http://www.tutorialspoint.com", "tags": ["mongodb", "database", "NoSQL"], "likes": 100 } >
  • 57. AND and OR in MongoDB where likes > 10 AND (by = 'tutorials point' OR title = 'MongoDB Overview') >db.mycol.find({"likes": {$gt:10}, $or: [{"by": "tutorials point"}, {"title": "MongoDB Overview"}]}).pretty() { "_id": ObjectId(7df78ad8902c), "title": "MongoDB Overview", "description": "MongoDB is no sql database", "by": "tutorials point", "url": "http://www.tutorialspoint.com", "tags": ["mongodb", "database", "NoSQL"], "likes": 100 }
  • 58. insertMany db.collection.insertMany( [ <document 1> , <document 2>, ... ], { writeConcern: <document>, ordered: <boolean> } )
  • 59. insertMany check null db.inventory.insertMany([ { _id: 1, item: null }, { _id: 2 } ]) db.inventory.find( { item: null } ) The { item : { $type: 10 } } query matches only documents that contain the item field whose value is null; i.e. the value of the item field is of BSON Type Null (type number 10): db.inventory.find( { item : { $type: 10 } } )
  • 60. insertMany check null db.inventory.find( { item : { $exists: false } } )
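The difference between "field is null" and "field is absent" ({item: null} vs. {item: {$exists: false}}) can be illustrated with a stdlib-only Python sketch over the same two invented documents:

```python
inventory = [
    {"_id": 1, "item": None},  # field present, value null
    {"_id": 2},                # field missing entirely
]

# {item: null} matches both: a null value OR a missing field.
null_query = [d for d in inventory if d.get("item") is None]

# {item: {$exists: false}} matches only documents lacking the field.
missing_field = [d for d in inventory if "item" not in d]

print([d["_id"] for d in null_query])     # both documents
print([d["_id"] for d in missing_field])  # only the second
```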
  • 61. MongoDB Update ● db.collection.updateOne(<filter>, <update>, <options>) ● db.collection.updateMany(<filter>, <update>, <options>) ● db.collection.replaceOne(<filter>, <update>, <options>) db.inventory.insertMany( [ { item: "canvas", qty: 100, size: { h: 28, w: 35.5, uom: "cm" }, status: "A" }, { item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" }, { item: "mat", qty: 85, size: { h: 27.9, w: 35.5, uom: "cm" }, status: "A" }, { item: "mousepad", qty: 25, size: { h: 19, w: 22.85, uom: "cm" }, status: "P" }, { item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "P" }, { item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" }, { item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" }, { item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" }, { item: "sketchbook", qty: 80, size: { h: 14, w: 21, uom: "cm" }, status: "A" }, { item: "sketch pad", qty: 95, size: { h: 22.85, w: 30.5, uom: "cm" }, status: "A" } ] );
  • 63. MongoDB Update db.inventory.updateOne( { item: "paper" }, { $set: { "size.uom": "cm", status: "P" }, $currentDate: { lastModified: true } } )
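What updateOne does with $set (including a dotted path like "size.uom") and $currentDate can be sketched in stdlib-only Python (a toy stand-in over invented data, handling only one level of nesting):

```python
from datetime import datetime, timezone

def update_one(collection, filt, update):
    """Apply $set and $currentDate to the first matching document
    (a toy stand-in for db.collection.updateOne())."""
    for doc in collection:
        if all(doc.get(k) == v for k, v in filt.items()):
            for field, value in update.get("$set", {}).items():
                # dotted paths like "size.uom" address embedded fields
                target, *rest = field.split(".")
                if rest:
                    doc.setdefault(target, {})[rest[0]] = value
                else:
                    doc[target] = value
            for field in update.get("$currentDate", {}):
                doc[field] = datetime.now(timezone.utc)
            return doc
    return None

inventory = [{"item": "paper", "status": "D", "size": {"uom": "in"}}]
updated = update_one(inventory, {"item": "paper"},
                     {"$set": {"size.uom": "cm", "status": "P"},
                      "$currentDate": {"lastModified": True}})
print(updated["size"]["uom"], updated["status"])
```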
  • 64. MongoDB Collection Update methods The following methods can also add new documents to a collection: ● db.collection.update() when used with the upsert: true option. ● db.collection.updateOne() when used with the upsert: true option. ● db.collection.updateMany() when used with the upsert: true option. ● db.collection.findAndModify() when used with the upsert: true option. ● db.collection.findOneAndUpdate() when used with the upsert: true option. ● db.collection.findOneAndReplace() when used with the upsert: true option. ● db.collection.save(). ● db.collection.bulkWrite().
  • 65. MongoDB Collection Update methods db.inventory.insertMany( [ { item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" }, { item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "P" }, { item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" }, { item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" }, { item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" }, ] );
  • 66. MongoDB Collection Delete methods db.inventory.deleteMany({ status : "A" }) deletes all documents where status is "A". The following example deletes the first document where status is "D": db.inventory.deleteOne( { status: "D" } )
  • 68. Data Modelling One to One Normalized { _id: "joe", name: "Joe Bookreader" } { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" }
  • 69. Data Modelling One to One Denormalized (Better Option) { _id: "joe", name: "Joe Bookreader", address: { street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" } }
  • 70. Data Modelling One to Many { _id: "joe", name: "Joe Bookreader" } { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" } { patron_id: "joe", street: "1 Some Other Street", city: "Boston", state: "MA", zip: "12345" }
  • 71. Data Modelling One to Many Denormalized { _id: "joe", name: "Joe Bookreader", addresses: [ { street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" }, { street: "1 Some Other Street", city: "Boston", state: "MA", zip: "12345" } ] } Save the number of queries Advantage
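The "save the number of queries" advantage can be made concrete with a stdlib-only Python sketch (in-memory dictionaries standing in for collections):

```python
# Denormalized one-to-many: the patron document embeds all addresses,
# so a single lookup (one "query") returns everything at once.
patrons = {
    "joe": {
        "name": "Joe Bookreader",
        "addresses": [
            {"street": "123 Fake Street", "city": "Faketon"},
            {"street": "1 Some Other Street", "city": "Boston"},
        ],
    }
}

joe = patrons["joe"]                            # one lookup...
cities = [a["city"] for a in joe["addresses"]]  # ...yields every address
print(cities)
```

In the normalized form, fetching the same information would require a second query against an addresses collection keyed by patron_id.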
  • 72. Data Modelling One to Many Denormalized
  • 73. Data Modelling The Process of Data Modeling in MongoDB Data modeling brings improved database performance, but requires weighing several considerations, which include: ● Data retrieval patterns ● Balancing the needs of the application, such as queries, updates and data processing ● Performance features of the chosen database engine ● The inherent structure of the data itself
  • 74. Data Modelling MongoDB Document Structure Documents in MongoDB play a major role in the decision making over which technique to apply for a given set of data. There are generally two relationships between data, which are: ● Embedded Data ● Reference Data
  • 75. Data Modelling Embedded Data ● In this case, related data is stored within a single document, either as a field value or an array within the document itself. The main advantage of this approach is that the data is denormalized, which provides an opportunity for manipulating the related data in a single database operation. Consequently, this improves the rate at which CRUD operations are carried out, and fewer queries are required. Let's consider the example of a document below.
  • 76. Data Modelling { "_id" : ObjectId("5b98bfe7e8b9ab9875e4c80c"), "StudentName" : "George Beckonn", "Settings" : { "location" : "Embassy", "ParentPhone" : 724765986, "bus" : "KAZ 450G", "distance" : "4", "placeLocation" : { "lat" : -0.376252, "lng" : 36.937389 } } }
  • 77. Data Modelling Strengths of Embedding 1. Increased data access speed: For an improved rate of access to data, embedding is the best option, since a single query operation can manipulate data within the specified document with just a single database look-up. 2. Reduced data inconsistency: During operation, if something goes wrong (for example a network disconnection or power failure), only a small number of documents may be affected, since the criteria often select a single document. 3. Reduced CRUD operations: the read operations will actually outnumber the writes. Besides, it is possible to update related data in a single atomic write operation. For the data above, we can update the phone number and also increase the distance with this single operation:
  • 78. Data Modelling db.students.updateOne({StudentName : "George Beckonn"}, { $set: {"ParentPhone" : 72436986}, $inc: {"Settings.distance": 1} })
  • 79. Data Modelling Weaknesses of Embedding 1. Restricted document size: All documents in MongoDB are constrained to the BSON size limit of 16 megabytes. Therefore, the overall document size together with embedded data should not surpass this limit. Otherwise, for some storage engines such as MMAPv1, data may outgrow the limit, resulting in data fragmentation and degraded write performance. 2. Data duplication: multiple copies of the same data make it harder to query the replicated data, and it may take longer to filter embedded documents, undermining the core advantage of embedding.
  • 80. Data Modelling Dot Notation Dot notation is how embedded data is addressed programmatically. It is used to access elements of an embedded field or an array. In the sample data above, we can return information about the student whose location is "Embassy" with this query using the dot notation: db.users.find({'Settings.location': 'Embassy'})
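Resolving a dotted path through a nested document can be sketched in stdlib-only Python (a toy model of dot notation, using the student document from the earlier slide):

```python
def get_path(doc, path):
    """Resolve a dotted path like 'Settings.location' in a nested
    document -- a toy model of MongoDB's dot notation."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

student = {
    "StudentName": "George Beckonn",
    "Settings": {"location": "Embassy",
                 "placeLocation": {"lat": -0.376252, "lng": 36.937389}},
}

print(get_path(student, "Settings.location"))           # Embassy
print(get_path(student, "Settings.placeLocation.lat"))  # -0.376252
```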
  • 81. Data Modelling Flexible Schema A flexible schema in MongoDB means that documents do not necessarily need to have the same fields or data types; a field can differ across documents within a collection. The core advantage of this concept is that one can add new fields, remove existing ones or change field values to a new type, and hence update the document to a new structure. For example, we can have these 2 documents in the same collection:
  • 82. Data Modelling { "_id" : ObjectId("5b98bfe7e8b9ab9875e4c80c"), "StudentName" : "George Beckonn", "ParentPhone" : 75646344, "age" : 10 } { "_id" : ObjectId("5b98bfe7e8b9ab98757e8b9a"), "StudentName" : "Fredrick Wesonga", "ParentPhone" : false, }
  • 83. Data Modelling Example: Let’s insert the data below in a client collection.
  • 84. Data Modelling However, validation can also be applied to already existing documents. There are 3 levels of validation: 1. Strict: this is the default MongoDB validation level and it applies validation rules to all inserts and updates. 2. Moderate: The validation rules are applied during inserts, updates and to already existing documents that fulfill the validation criteria only. 3. Off: this level sets the validation rules for a given schema to null hence no validation will be done to the documents.
  • 85. Data Modelling db.clients.insert([ { "_id" : 1, "name" : "Brillian", "phone" : "+1 778 574 666", "city" : "Beijing", "status" : "Married" }, { "_id" : 2, "name" : "James", "city" : "Peninsula" } ])
  • 86. Data Modelling If we apply the moderate validation level using: db.runCommand( { collMod: "test", validator: { $jsonSchema: { bsonType: "object", required: [ "phone", "name" ], properties: { phone: { bsonType: "string", description: "must be a string and is required" }, name: { bsonType: "string", description: "must be a string and is required" } } } }, validationLevel: "moderate" } )
  • 87. Data Modelling Schema Validation Actions After doing validation on documents, there may be some that may violate the validation rules. There is always a need to provide an action when this happens. MongoDB provides two actions that can be issued to the documents that fail the validation rules: 1. Error: this is the default MongoDB action, which rejects any insert or update in case it violates the validation criteria. 2. Warn: This action will record the violation in the MongoDB log, but allows the insert or update operation to be completed. For example:
  • 88. Data Modelling db.createCollection("students", { validator: { $jsonSchema: { bsonType: "object", required: [ "name", "gpa" ], properties: { name: { bsonType: "string", description: "must be a string and is required" }, gpa: { bsonType: [ "double" ], minimum: 0, description: "must be a double and is required" } } } }, validationAction: "warn" })
  • 89. Data Modelling db.students.insert( { name: "Amanda", status: "Updated" } ) The gpa field is missing even though it is required in the schema design, but since the validation action is set to warn, the document is saved and a message is recorded in the MongoDB log. Validation actions cannot be set on the admin, local, and config databases.
  • 90. Data Modelling db.createCollection( "contacts5", { validator: { $jsonSchema: { bsonType: "object", required: [ "phone" ], properties: { phone: { bsonType: "string", description: "must be a string and is required" }, email: { bsonType : "string", pattern : "@mongodb.com$", description: "must be a string and match the regular expression pattern" }, status: { enum: [ "Unknown", "Incomplete" ], description: "can only be one of the enum values" } } } }, validationAction: "error" } )
  • 91. Data Modelling > db.contacts5.insert( { phone: "123" ,email: "abc@mongodb.com", status: "Unknown" } ) WriteResult({ "nInserted" : 1 }) > db.contacts5.insert( { phone: 123 ,email: "abc@mongodb.com", status: "Unknown" } ) WriteResult({ "nInserted" : 0, "writeError" : { "code" : 121, "errmsg" : "Document failed validation" } }) >
  • 94. Data Modelling Denormalization The advantage of this is that you need one less query to get the information. The downside is that it takes up more space and is harder to keep in sync. For example, suppose we decide that the light style should be renamed day. We would have to update every single document where user.accountsPref.style was light.
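The multi-document update described above can be sketched in plain JavaScript over an in-memory array. This is a minimal simulation, not MongoDB's implementation; in a live deployment the assumed equivalent would be a single updateMany against the users collection, and the accountsPref name is taken from the slide's example.

```javascript
// Simulating the denormalization cost: renaming "light" to "day" means
// touching every document that holds a copy of the value.
// Assumed shell equivalent (shape only):
//   db.users.updateMany({ "accountsPref.style": "light" },
//                       { $set: { "accountsPref.style": "day" } })
const users = [
  { _id: 1, accountsPref: { style: "light" } },
  { _id: 2, accountsPref: { style: "dark" } },
  { _id: 3, accountsPref: { style: "light" } },
];

function renameStyle(docs, from, to) {
  let modified = 0;
  for (const doc of docs) {
    if (doc.accountsPref && doc.accountsPref.style === from) {
      doc.accountsPref.style = to;
      modified++; // analogous to updateMany's modifiedCount
    }
  }
  return modified;
}
```

The count returned plays the role of updateMany's modifiedCount: every denormalized copy costs one write.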
  • 95. Data Modelling Join db.orders.insert([ { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 }, { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 }, { "_id" : 3 } ]) db.inventory.insert([ { "_id" : 1, "sku" : "almonds", description: "product 1", "instock" : 120 }, { "_id" : 2, "sku" : "bread", description: "product 2", "instock" : 80 }, { "_id" : 3, "sku" : "cashews", description: "product 3", "instock" : 60 }, { "_id" : 4, "sku" : "pecans", description: "product 4", "instock" : 70 }, { "_id" : 5, "sku": null, description: "Incomplete" }, { "_id" : 6 } ])
  • 96. Data Modelling Join db.orders.aggregate([ { $lookup: { from: "inventory", localField: "item", foreignField: "sku", as: "inventory_docs" } } ])
  • 97. Data Modelling Join The approximate SQL equivalent is: SELECT *, inventory_docs FROM orders WHERE inventory_docs IN (SELECT * FROM inventory WHERE sku = orders.item);
  • 98. Data Modelling Nested Pipeline db.orders2.insert([ { "_id" : 1, "item" : "almonds", "price" : 12, "ordered" : 2 }, { "_id" : 2, "item" : "pecans", "price" : 20, "ordered" : 1 }, { "_id" : 3, "item" : "cookies", "price" : 10, "ordered" : 60 } ])
  • 99. Data Modelling Nested Pipeline db.warehouses.insert([ { "_id" : 1, "stock_item" : "almonds", warehouse: "A", "instock" : 120 }, { "_id" : 2, "stock_item" : "pecans", warehouse: "A", "instock" : 80 }, { "_id" : 3, "stock_item" : "almonds", warehouse: "B", "instock" : 60 }, { "_id" : 4, "stock_item" : "cookies", warehouse: "B", "instock" : 40 }, { "_id" : 5, "stock_item" : "cookies", warehouse: "A", "instock" : 80 } ])
  • 100. Data Modelling Nested Pipeline db.orders2.aggregate([ { $lookup: { from: "warehouses", let: { order_item: "$item", order_qty: "$ordered" }, pipeline: [ { $match: { $expr: { $and: [ { $eq: [ "$stock_item", "$$order_item" ] }, { $gte: [ "$instock", "$$order_qty" ] } ] } } }, { $project: { stock_item: 0, _id: 0 } } ], as: "stockdata" } } ])
  • 101. Data Modelling Nested Pipeline The pipeline above is equivalent to the following SQL statement: SELECT *, stockdata FROM orders WHERE stockdata IN (SELECT warehouse, instock FROM warehouses WHERE stock_item= orders.item AND instock >= orders.ordered );
  • 103. Data Modelling Tree Structures
  • 104. Data Modelling Tree Structures Parent reference db.categories.insert( { _id: "MongoDB", parent: "Databases" } ) db.categories.insert( { _id: "dbm", parent: "Databases" } ) db.categories.insert( { _id: "Databases", parent: "Programming" } ) db.categories.insert( { _id: "Languages", parent: "Programming" } ) db.categories.insert( { _id: "Programming", parent: "Books" } ) db.categories.insert( { _id: "Books", parent: null } ) db.categories.findOne( { _id: "MongoDB" } ).parent You can create an index on the field parent to enable fast search by the parent node: db.categories.createIndex( { parent: 1 } )
  • 105. Data Modelling Tree Structures You can query by the parent field to find its immediate children nodes: db.categories.find( { parent: "Databases" } )
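The parent-reference pattern above can be simulated in plain JavaScript to show its main trade-off: finding a node's full ancestry requires one lookup per tree level. This is an illustrative in-memory sketch using the same category names as the slides, not a MongoDB driver call.

```javascript
// Parent-reference pattern: each "document" stores only its parent,
// mirroring the categories collection above.
const categories = {
  MongoDB:     { parent: "Databases" },
  dbm:         { parent: "Databases" },
  Databases:   { parent: "Programming" },
  Languages:   { parent: "Programming" },
  Programming: { parent: "Books" },
  Books:       { parent: null },
};

// Walking to the root takes one lookup per level -- in MongoDB terms,
// one findOne per ancestor, which is the cost of this pattern compared
// with storing an ancestors array.
function pathToRoot(id) {
  const path = [];
  let parent = categories[id].parent;
  while (parent !== null) {
    path.push(parent);
    parent = categories[parent].parent;
  }
  return path;
}
```

For example, pathToRoot("MongoDB") walks Databases, then Programming, then Books.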
  • 106. Data Modelling Tree Structures child as reference db.categories4.insert( { _id: "MongoDB", children: [] } ) db.categories4.insert( { _id: "dbm", children: [] } ) db.categories4.insert( { _id: "Databases", children: [ "MongoDB", "dbm" ] } ) db.categories4.insert( { _id: "Languages", children: [] } ) db.categories4.insert( { _id: "Programming", children: [ "Databases", "Languages" ] } ) db.categories4.insert( { _id: "Books", children: [ "Programming" ] } )
  • 107. Data Modelling Tree Structures child as reference db.categories4.findOne( { _id: "Databases" } ).children Create an index on the children field for fast search: db.categories4.createIndex( { children: 1 } ) You can query for a node in the children field to find its parent node as well as its siblings: db.categories4.find( { children: "MongoDB" } )
  • 108. Data Modelling Tree Structures ancestors as reference db.categories5.insert( { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } ) db.categories5.insert( { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } ) db.categories5.insert( { _id: "Databases", ancestors: [ "Books", "Programming" ], parent: "Programming" } ) db.categories5.insert( { _id: "Languages", ancestors: [ "Books", "Programming" ], parent: "Programming" } ) db.categories5.insert( { _id: "Programming", ancestors: [ "Books" ], parent: "Books" } ) db.categories5.insert( { _id: "Books", ancestors: [ ], parent: null } )
  • 109. Data Modelling Tree Structures ancestors as reference db.categories5.findOne( { _id: "MongoDB" } ).ancestors db.categories5.createIndex( { ancestors: 1 } ) You can query by the field ancestors to find all its descendants: db.categories5.find( { ancestors: "Programming" } )
  • 110. Data Modelling Tree Structures Materialized Path db.categories7.insert( { _id: "Books", path: null } ) db.categories7.insert( { _id: "Programming", path: ",Books," } ) db.categories7.insert( { _id: "Databases", path: ",Books,Programming," } ) db.categories7.insert( { _id: "Languages", path: ",Books,Programming," } ) db.categories7.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } ) db.categories7.insert( { _id: "dbm", path: ",Books,Programming,Databases," } )
  • 111. Data Modelling Tree Structures Materialized Path db.categories7.find().sort( { path: 1 } ) db.categories7.find( { path: /,Programming,/ } ) db.categories7.find( { path: /^,Books,/ } ) db.categories7.createIndex( { path: 1 } )
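The materialized-path queries above rely on the comma delimiters around each path segment. A minimal plain-JavaScript sketch, using the same path strings and the same regexes as the categories7 examples, shows what each regex matches:

```javascript
// Materialized paths: each document stores its full ancestry as a
// comma-delimited string, so subtree queries become regex matches.
const docs = [
  { _id: "Books",       path: null },
  { _id: "Programming", path: ",Books," },
  { _id: "Databases",   path: ",Books,Programming," },
  { _id: "Languages",   path: ",Books,Programming," },
  { _id: "MongoDB",     path: ",Books,Programming,Databases," },
  { _id: "dbm",         path: ",Books,Programming,Databases," },
];

// All descendants of Programming: path contains ",Programming,"
const underProgramming = docs.filter(d => d.path && /,Programming,/.test(d.path));

// Everything below the root Books: path starts with ",Books,"
// (anchored regexes like this can use an index on path; unanchored ones cannot)
const underBooks = docs.filter(d => d.path && /^,Books,/.test(d.path));
```

The surrounding commas are what make the match exact: without them, a category named "Programming2" would also match.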
  • 112. Data Modelling Tree Structures Nested Sets
  • 113. Data Modelling Tree Structures Nested Sets db.categories6.insert( { _id: "Books", parent: 0, left: 1, right: 12 } ) db.categories6.insert( { _id: "Programming", parent: "Books", left: 2, right: 11 } ) db.categories6.insert( { _id: "Languages", parent: "Programming", left: 3, right: 4 } ) db.categories6.insert( { _id: "Databases", parent: "Programming", left: 5, right: 10 } ) db.categories6.insert( { _id: "MongoDB", parent: "Databases", left: 6, right: 7 } ) db.categories6.insert( { _id: "dbm", parent: "Databases", left: 8, right: 9 } ) Query to retrieve descendants: var databaseCategory = db.categories6.findOne( { _id: "Databases" } ); db.categories6.find( { left: { $gt: databaseCategory.left }, right: { $lt: databaseCategory.right } } );
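The nested-sets invariant behind the query above is that a descendant's left/right values always fall strictly inside its ancestor's. A small plain-JavaScript sketch over the same left/right numbering makes the range comparison concrete:

```javascript
// Nested sets: descendants of a node are exactly the documents whose
// left/right interval lies inside the node's interval.
const categories6 = [
  { _id: "Books",       left: 1, right: 12 },
  { _id: "Programming", left: 2, right: 11 },
  { _id: "Languages",   left: 3, right: 4 },
  { _id: "Databases",   left: 5, right: 10 },
  { _id: "MongoDB",     left: 6, right: 7 },
  { _id: "dbm",         left: 8, right: 9 },
];

function descendants(id) {
  const node = categories6.find(d => d._id === id);
  // Same predicate as the find() above: left > node.left, right < node.right
  return categories6
    .filter(d => d.left > node.left && d.right < node.right)
    .map(d => d._id);
}
```

Subtree reads are a single range query, which is why this pattern is fast to read but expensive to modify: inserting a node renumbers much of the tree.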
  • 114. Data Modelling Operational Strategies { last_name : "Smith", best_score: 3.9 } Change it to: { lname : "Smith", score : 3.9 } and save 9 bytes of field-name storage in every document.
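The 9-byte figure comes from the fact that BSON stores every field name inside every document, so the per-document saving is simply the difference in total key length. A quick sketch of the arithmetic:

```javascript
// BSON repeats field names in each document, so shorter names save
// (total key length before) - (total key length after) bytes per document.
function keyBytes(doc) {
  return Object.keys(doc).reduce((sum, k) => sum + k.length, 0);
}

const before = { last_name: "Smith", best_score: 3.9 }; // keys: 9 + 10 = 19
const after  = { lname: "Smith", score: 3.9 };          // keys: 5 + 5  = 10
const savedPerDocument = keyBytes(before) - keyBytes(after); // 19 - 10 = 9
```

Multiplied across millions of documents, field-name choice becomes a real storage and cache-footprint decision.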
  • 115. Data Modelling Operational Strategies Data Lifecycle Management Data modeling decisions should take data lifecycle management into consideration. The Time to Live or TTL feature of collections expires documents after a period of time. Consider using the TTL feature if your application requires some data to persist in the database for a limited period of time. Additionally, if your application only uses recently inserted documents, consider Capped Collections. Capped collections provide first-in-first-out (FIFO) management of inserted documents and efficiently support operations that insert and read documents based on insertion order.
  • 116. MongoDB backup Strategies Backup Strategy #1: mongodump Backup Strategy #2: Copying the Underlying Files For example, Linux LVM quickly and efficiently creates a consistent snapshot of the file system that can be copied for backup and restore purposes. To ensure that the snapshot is logically consistent, you must have journaling enabled within MongoDB. Backup Strategy #3: MongoDB Management Service (MMS) MongoDB Management Service provides continuous, online backup for MongoDB as a fully managed service. You install the Backup Agent in your environment, which conducts an initial sync to MongoDB’s secure and redundant datacenters. After the initial sync, the Backup Agent streams encrypted and compressed MongoDB oplog data to MMS so that you have a continuous backup.
  • 117. MongoDB Monitoring When it comes to MongoDB monitoring, some of the important metrics to monitor are: ● Performance stats ● Utilization of resources (CPU usage, available memory and Network usage) ● Assert stats ● Replication stats ● Saturation of resources ● Throughput operations Applications Manager MongoDB monitoring service supports all versions of MongoDB up to version 4.0.2.
  • 118. MongoDB Relationship User { "_id":ObjectId("52ffc33cd85242f436000001"), "name": "Tom Hanks", "contact": "987654321", "dob": "01-01-1991" }
  • 119. MongoDB Relationship Address { "_id":ObjectId("52ffc4a5d85242602e000000"), "building": "22 A, Indiana Apt", "pincode": 123456, "city": "Los Angeles", "state": "California" }
  • 120. MongoDB Relationship Modelling Embedded Relationship { "_id":ObjectId("52ffc33cd85242f436000001"), "contact": "987654321", "dob": "01-01-1991", "name": "Tom Benzamin", "address": [ { "building": "22 A, Indiana Apt", "pincode": 123456, "city": "Los Angeles", "state": "California" } ] }
  • 121. MongoDB Relationship This approach maintains all the related data in a single document, which makes it easy to retrieve and maintain. The whole document can be retrieved in a single query such as − >db.users.findOne({"name":"Tom Benzamin"},{"address":1})
  • 123. MongoDB Scalability The primary node receives all write operations. A replica set can have only one primary capable of confirming writes with { w: "majority" } write concern; although in some circumstances, another mongod instance may transiently believe itself to also be primary. [1] The primary records all changes to its data sets in its operation log, i.e. oplog. For more information on primary node operation, see Replica Set Primary. The secondaries replicate the primary’s oplog and apply the operations to their data sets such that the secondaries’ data sets reflect the primary’s data set. If the primary is unavailable, an eligible secondary will hold an election to elect itself the new primary. For more information on secondary members, see Replica Set Secondary Members.
  • 126. MongoDB Scalability An arbiter will always be an arbiter, whereas a primary may step down and become a secondary, and a secondary may become the primary during an election.
  • 127. MongoDB Scalability Failover When a primary does not communicate with the other members of the set for more than the configured electionTimeoutMillis period (10 seconds by default), an eligible secondary calls for an election to nominate itself as the new primary. The cluster attempts to complete the election of a new primary and resume normal operations.
  • 128. MongoDB Scalability Failover Read Preference By default, clients read from the primary [1]; however, clients can specify a read preference to send read operations to secondaries.
  • 130. MongoDB Replicaset mkdir -p rs1 rs2 rs3 Localhost replica mongod --replSet mongo1 --logpath "rs1.log" --dbpath rs1 --port 27017 & mongod --replSet mongo1 --logpath "rs2.log" --dbpath rs2 --port 27018 & mongod --replSet mongo1 --logpath "rs3.log" --dbpath rs3 --port 27019 & config = { _id: "mongo1", members:[ {_id : 0, host : "localhost:27017"}, {_id : 1, host : "localhost:27018"}, {_id : 2, host : "localhost:27019"}] }; rs.initiate(config) rs.status()
  • 131. MongoDB Replicaset Do the below changes on all the nodes: ➢ /etc/mongodb.conf file replSet = mongoRs /etc/hosts file Restart mongodb $mongo --host 192.168.64.24 config = { _id: "mongoRs", members:[ {_id : 0,host : "node1:27017"}, {_id : 1,host : "node2:27017"}] };
  • 132. MongoDB Replicaset mkdir -p rs1 rs2 rs3 mongod --replSet mongo1 --logpath "rs1.log" --dbpath rs1 --port 27017 & mongod --replSet mongo1 --logpath "rs2.log" --dbpath rs2 --port 27018 & mongod --replSet mongo1 --logpath "rs3.log" --dbpath rs3 --port 27019 &
  • 133. MongoDB Sharding sh.addTagRange("records.users", { zipcode: "10001" }, { zipcode: "10281" }, "NYC") sh.addTagRange("records.users", { zipcode: "11201" }, { zipcode: "11240" }, "NYC") sh.addTagRange("records.users", { zipcode: "94102" }, { zipcode: "94135" }, "SFO")
  • 134. MongoDB Sharding Horizontal Partitioning: storing rows in multiple databases.
  • 136. MongoDB Algorithmic Sharding Algorithmically sharded databases use a sharding function (partition_key) -> database_id to locate data. A simple sharding function may be “hash(key) % NUM_DB”.
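The "hash(key) % NUM_DB" function above can be sketched in plain JavaScript. The hash used here is a simple illustrative rolling hash, not the one MongoDB (or any particular sharded system) uses:

```javascript
// Algorithmic sharding sketch: (partition_key) -> database_id.
const NUM_DB = 4; // assumed cluster size for the example

// Illustrative 32-bit rolling hash (NOT MongoDB's actual hash function).
function hash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep unsigned 32-bit
  }
  return h;
}

function shardFor(partitionKey) {
  return hash(partitionKey) % NUM_DB; // database_id in [0, NUM_DB)
}
```

The function is deterministic, so any node can locate a key without metadata lookups; the flip side is that changing NUM_DB remaps most keys, which is why real systems favor consistent hashing or range/chunk metadata for resharding.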
  • 141. MongoDB Hierarchical Sharding For example, if the shard key is: { a: 1, b: 1, c: 1 } The mongos program can route queries that include the full shard key or either of the following shard key prefixes at a specific shard or set of shards: { a: 1 } { a: 1, b: 1 }
  • 142. MongoDB Shard Collections Distribution All insertOne() operations target one shard. Each document in an insertMany() array targets a single shard, but there is no guarantee that all documents in the array are inserted into a single shard. All updateOne(), replaceOne() and deleteOne() operations must include the shard key or _id in the query document. MongoDB returns an error if these methods are used without the shard key or _id. Depending on the distribution of data in the cluster and the selectivity of the query, mongos may still perform a broadcast operation to fulfill these queries.
  • 143. MongoDB Components of Sharding The components of sharding include 1. A Shard – a MongoDB instance that holds a subset of the data. In production environments, all shards should be part of replica sets. 2. Config server – a MongoDB instance that holds metadata about the cluster, essentially information about the various MongoDB instances that hold the shard data. 3. A Router – a MongoDB instance responsible for redirecting the commands sent by the client to the right servers.
  • 144. MongoDB Components of Sharding & Setup Setup Requirements We require the following servers for Mongodb Sharding setup: Query server – Server A Config server & Shards / Replica Set – Server B Config server & Shards / Replica Set – Server C Config server & Shards / Replica Set – Server D
  • 145. MongoDB Components of Sharding Run in All servers $ apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10 $ echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list $ sudo apt-get update $ sudo apt-get install -y mongodb-org Skip the above steps if MongoDB is already installed and apt is set up.
  • 146. MongoDB Query Router Setup $ echo "mongodb-org hold" | sudo dpkg --set-selections $ echo "mongodb-org-server hold" | sudo dpkg --set-selections $ echo "mongodb-org-shell hold" | sudo dpkg --set-selections $ echo "mongodb-org-mongos hold" | sudo dpkg --set-selections $ echo "mongodb-org-tools hold" | sudo dpkg --set-selections MongoDB Sharding Config Server Setup Run in Servers B, C and D $ mkdir -p /var/mongodb/mongo-metadata $ mongod --configsvr --dbpath /var/mongodb/mongo-metadata --port 27019
  • 147. Enabling Sharding Mongo Shards Replica set setup in Servers B, C and D $ mkdir -p /var/mongodb/store/rs1 $ mkdir -p /var/mongodb/store/rs2 $ mkdir -p /var/mongodb/store/rs3
  • 148. Enabling Sharding Run the MongoDB shard replica set $ mongod --shardsvr --replSet rs1 --dbpath /var/mongodb/store/rs1 --port 27000 $ mongod --shardsvr --replSet rs1 --dbpath /var/mongodb/store/rs2 --port 27001 $ mongod --shardsvr --replSet rs1 --dbpath /var/mongodb/store/rs3 --port 27002
  • 149. Sharding Steps In Mongo Console $mongo --port 27000 mongo> rs.initiate() mongo> rs.add("shard.host.com:27001") mongo> rs.add("shard.host.com:27002")
  • 150. MongoDB Sharding Query Router setup MongoDB Sharding Query Routers setup in Server A $sudo service mongodb stop In order to start the query router service: # Using this same (following) command you can start multiple query servers $ mongos --configdb config0.host.com:27019,config1.host.com:27019,config2.host.com:27019
  • 151. Add Shard Server in Query Router In Server A $ mongo --host query0.example.com --port 27017 mongo> sh.addShard( "rep_set_name/rep_set_member:27017" ) $ mongo --port 27000 mongo> rs.status()
  • 152. Enabling Sharding for a Database and Collection $ mongo --host query0.example.com --port 27017 mongo> sh.stopBalancer() #Enabling Sharding to the Database mongo> use current_DB mongo> sh.enableSharding("#{current_DB}") Enabling Sharding to the Collection mongo> use current_DB mongo> sh.shardCollection( "current_DB.test_collection", { "_id" : "shard key" } )
  • 153. MongoDB Indexes What Problem do Indexes solve? Slow Queries Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
  • 155. MongoDB Indexes db.collection.createIndex( <key and index type specification>, <options> ) db.collection.createIndex( { name: -1 } ) You can create indexes with a custom name, such as one that is more human-readable than the default. For example, consider an application that frequently queries the products collection to populate data on existing inventory. The following createIndex() method creates an index on item and quantity named "query for inventory": db.products.createIndex( { item: 1, quantity: -1 } , { name: "query for inventory" } )
  • 156. MongoDB Indexes Types Single Field In addition to the MongoDB-defined _id index, MongoDB supports the creation of user-defined ascending/descending indexes on a single field of a document.
  • 157. MongoDB Indexes Types Compound Index MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes. The order of fields listed in a compound index has significance. For instance, if a compound index consists of {userid: 1, score: -1 }, the index sorts first by userid and then, within each userid value, sorts by score.
  • 158. MongoDB Indexes Types Multikey Index MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on element or elements of the arrays. MongoDB automatically determines whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify the multikey type.
  • 160. MongoDB Indexes Types Geospatial Index To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes, which use planar geometry when returning results, and 2dsphere indexes, which use spherical geometry to return results. See 2d Index Internals for a high level introduction to geospatial indexes.
  • 161. MongoDB Indexes Types Text Indexes MongoDB provides a text index type that supports searching for string content in a collection. These text indexes do not store language-specific stop words (e.g. “the”, “a”, “or”) and stem the words in a collection to only store root words.
  • 162. MongoDB Indexes Types Hashed Indexes To support hash based sharding, MongoDB provides a hashed index type, which indexes the hash of the value of a field. These indexes have a more random distribution of values along their range, but only support equality matches and cannot support range-based queries.
  • 163. MongoDB Indexes Properties Unique Indexes The unique property for an index causes MongoDB to reject duplicate values for the indexed field. Other than the unique constraint, unique indexes are functionally interchangeable with other MongoDB indexes. Partial Indexes New in version 3.2. Partial indexes only index the documents in a collection that meet a specified filter expression. By indexing a subset of the documents in a collection, partial indexes have lower storage requirements and reduced performance costs for index creation and maintenance. Partial indexes offer a superset of the functionality of sparse indexes and should be preferred over sparse indexes.
  • 164. MongoDB Indexes Properties Sparse Indexes The sparse property of an index ensures that the index only contains entries for documents that have the indexed field. The index skips documents that do not have the indexed field. You can combine the sparse index option with the unique index option to prevent inserting documents that have duplicate values for the indexed field(s) and skip indexing documents that lack the indexed field(s). TTL Indexes TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time. This is ideal for certain types of information like machine generated event data, logs, and session information that only need to persist in a database for a finite amount of time.
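The TTL behavior described above can be modeled in a few lines of plain JavaScript: a document is eligible for removal once its indexed date field is older than expireAfterSeconds. This only models the cutoff logic; MongoDB's background TTL monitor applies it roughly once a minute, so expiry is not instantaneous. The sessions collection and field names here are illustrative.

```javascript
// TTL cutoff logic: keep documents whose date field is newer than
// (now - expireAfterSeconds). Assumed shell equivalent (shape only):
//   db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
function expireDocs(docs, field, expireAfterSeconds, now) {
  const cutoff = now - expireAfterSeconds * 1000;
  return docs.filter(d => d[field].getTime() >= cutoff); // survivors
}

const now = Date.now();
const sessions = [
  { _id: 1, createdAt: new Date(now - 7200 * 1000) }, // two hours old -> expires
  { _id: 2, createdAt: new Date(now - 60 * 1000) },   // one minute old -> kept
];

const live = expireDocs(sessions, "createdAt", 3600, now);
```

With a one-hour TTL, only the one-minute-old session survives the sweep.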
  • 165. MongoDB Indexes Uses Indexes can improve the efficiency of read operations. The Analyze Query Performance tutorial provides an example of the execution statistics of a query with and without an index. For information on how MongoDB chooses an index to use, see query optimizer.
  • 166. MongoDB Indexes & Collation To use an index for string comparisons, an operation must also specify the same collation. That is, an index with a collation cannot support an operation that performs string comparisons on the indexed fields if the operation specifies a different collation. For example, the collection myColl has an index on a string field category with the collation locale "fr". db.myColl.createIndex( { category: 1 }, { collation: { locale: "fr" } } ) The following query can use the index: db.myColl.find( { category: "cafe" } ).collation( { locale: "fr" } ) The following query, which uses the default "simple" collation, cannot use the index: db.myColl.find( { category: "cafe" } )
  • 167. MongoDB Indexes & Collation For example, the collection myColl has a compound index on the numeric fields score and price and the string field category; the index is created with the collation locale "fr" for string comparisons: db.myColl.createIndex( { score: 1, price: 1, category: 1 }, { collation: { locale: "fr" } } ) The following operations, which use "simple" binary collation for string comparisons, can use the index: db.myColl.find( { score: 5 } ).sort( { price: 1 } ) db.myColl.find( { score: 5, price: { $gt: NumberDecimal( "10" ) } } ).sort( { price: 1 } )
  • 168. MongoDB Indexes & Collation The following operation, which uses "simple" binary collation for string comparisons on the indexed categoryfield, can use the index to fulfill only the score: 5 portion of the query: db.myColl.find( { score: 5, category: "cafe" } )
  • 170. MongoDB Operations that support collation Operations that Support Collation All reading, updating, and deleting methods support collation. Some examples are listed below. find() and sort() Individual queries can specify a collation to use when matching and sorting results. The following query and sort operation uses a German collation with the locale parameter set to de.
  • 171. MongoDB Aggregation Types ● Pipeline ● MapReduce ● Single Purpose
  • 173. MongoDB Aggregation Pipeline ● Pipeline ● Pipeline Expressions ● Aggregation Pipeline Behavior ● Considerations
  • 174. MongoDB Aggregation Pipeline db.orders.insertMany([ { item: "journal", amount: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" }, { item: "notebook", amount: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "A" }, { item: "paper", amount: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" }, { item: "planner", amount: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" }, { item: "postcard", amount: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" } ]);
  • 175. MongoDB Aggregation Pipeline db.orders.aggregate([ { $match: { status: "A" } }, { $group: { _id: "$cust_id", total: { $sum: "$amount" } } } ]) *This works like Unix pipes. First Stage: The $match stage filters the documents by the status field and passes to the next stage those documents that have status equal to "A". Second Stage: The $group stage groups the documents by the cust_id field to calculate the sum of the amount for each unique cust_id.
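The two stages above can be simulated over plain JavaScript objects to make the data flow concrete. The sample documents are assumed for illustration (cust_id/amount/status fields as in the pipeline), and filter/reduce stand in for $match and $group:

```javascript
// Simulating the pipeline: stage 1 ($match) keeps status "A",
// stage 2 ($group with $sum) totals amount per cust_id.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" }, // filtered out by $match
];

const matched = orders.filter(o => o.status === "A"); // $match stage

const totals = matched.reduce((acc, o) => {           // $group + $sum stage
  acc[o.cust_id] = (acc[o.cust_id] || 0) + o.amount;
  return acc;
}, {});
```

Like Unix pipes, each stage consumes the previous stage's output: four documents enter $match, three reach $group, and two grouped totals come out.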
  • 176. MongoDB Aggregation Pipeline The MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through the pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some stages may generate new documents or filter out documents. Pipeline stages can appear multiple times in the pipeline with the exception of $out, $merge, and $geoNearstages. For a list of all available stages, see Aggregation Pipeline Stages. MongoDB provides the db.collection.aggregate() method in the mongo shell and the aggregatecommand to run the aggregation pipeline. For example usage of the aggregation pipeline, consider Aggregation with User Preference Data and Aggregation with the Zip Code Data Set.
  • 177. MongoDB Aggregation Single Purpose > db.orders.insert([ { cust_id : "A123", amount : 500, status : "A" }, { cust_id : "A123", amount : 250, status : "A" }, { cust_id : "B212", amount : 200, status : "A" }, { cust_id : "A123", amount : 300, status : "D" } ]) > db.orders.distinct("cust_id") [ "A123", "B212" ]
  • 178. MongoDB Performance ● Locking Performance ● Number of Connections ● Database Profiling ● Full Time Diagnostic Data Capture
  • 179. MongoDB Performance Number of Connections In some cases, the number of connections between the applications and the database can overwhelm the ability of the server to handle requests. The following fields in the serverStatus document can provide insight: ● connections is a container for the following two fields: ○ connections.current the total number of current clients connected to the database instance. ○ connections.available the total number of unused connections available for new clients. If there are numerous concurrent application requests, the database may have trouble keeping up with demand. If this is the case, then you will need to increase the capacity of your deployment.
  • 180. MongoDB Performance Database Profiling The Database Profiler collects detailed information about operations run against a mongod instance. The profiler’s output can help to identify inefficient queries and operations. You can enable and configure profiling for individual databases or for all databases on a mongod instance. Profiler settings affect only a single mongod instance and will not propagate across a replica set or sharded cluster. See Database Profiler for information on enabling and configuring the profiler.
  • 181. MongoDB Performance The following profiling levels are available: Level 0 – the profiler is off and does not collect any data (the default profiler level). Level 1 – the profiler collects data for operations that take longer than the value of slowms. Level 2 – the profiler collects data for all operations.
  • 182. MongoDB Performance - DB profiling Enable and Configure Database Profiling This section uses the mongo shell helper db.setProfilingLevel() to enable profiling. For instructions using the driver, see your driver documentation. When you enable profiling for a mongod instance, you set the profiling level to a value greater than 0. The profiler records data in the system.profile collection. MongoDB creates the system.profile collection in a database after you enable profiling for that database. To enable profiling and set the profiling level, pass the profiling level to the db.setProfilingLevel() helper. For example, to enable profiling for all database operations, consider the following operation in the mongo shell:
  • 183. MongoDB Performance - DB profiling db.setProfilingLevel(2) { "was" : 0, "slowms" : 100, "sampleRate" : 1.0, "ok" : 1 }
  • 184. MongoDB Performance - DB profiling Specify the Threshold for Slow Operations By default, the slow operation threshold is 100 milliseconds. To change the slow operation threshold, specify the desired threshold value in one of the following ways: ● Set the value of slowms using the profile command or db.setProfilingLevel() shell helper method. ● Set the value of --slowms from the command line at startup. ● Set the value of slowOpThresholdMs in a configuration file. For example, the following code sets the profiling level for the current mongod instance to 1 and sets the slow operation threshold for the mongod instance to 20 milliseconds: db.setProfilingLevel(1, { slowms: 20 }) Profiling level of 1 will profile operations slower than the threshold.
  • 185. MongoDB Performance - DB profiling IMPORTANT The slow operation threshold applies to all databases in a mongod instance. It is used by both the database profiler and the diagnostic log and should be set to the highest useful value to avoid performance degradation.
  • 186. MongoDB Performance - DB profiling Check Profiling Level To view the profiling level, issue the following from the mongo shell: db.getProfilingStatus() To enable profiling for a mongod instance, pass the following options to mongod at startup. $mongod --profile 1 --slowms 15 --slowOpSampleRate 0.5
  • 187. Example Data profiler Queries This section displays example queries to the system.profile collection. For an explanation of the query output, see Database Profiler Output. To return the most recent 10 log entries in the system.profile collection, run a query similar to the following: >db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty() >db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
  • 188. Example Database Profiler Queries To return operations for a particular collection, run a query similar to the following. This example returns operations in the mydb database's test collection:
>db.system.profile.find( { ns : 'mydb.test' } ).pretty()
To return operations slower than 5 milliseconds, run:
>db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
  • 190. Example Database Profiler Queries based on time To return information from a certain time range, run a query similar to the following:
db.system.profile.find({ ts : { $gt: new ISODate("2019-11-09T03:00:00Z"), $lt: new ISODate("2019-11-09T03:40:00Z") } }).pretty()
  • 191. Example: Create a New Profiler Collection For example, to create a new system.profile collection that is 4,000,000 bytes, use the following sequence of operations in the mongo shell (profiling must be disabled before the collection can be dropped):
db.setProfilingLevel(0)
db.system.profile.drop()
db.createCollection( "system.profile", { capped: true, size: 4000000 } )
db.setProfilingLevel(1)
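After recreating the collection, the result can be checked from the mongo shell. This is an illustrative verification step, not shown on the slide; it relies on the capped and maxSize fields of the collection-stats output:

```javascript
// Verify the recreated profiler collection is capped at the requested size.
// Run against the same database where profiling was re-enabled.
var stats = db.system.profile.stats();
print(stats.capped);   // should print true
print(stats.maxSize);  // should print 4000000
```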
  • 193. MongoDB security Role Based Security Role-Based Access Control Enable Access Control Manage Users and Roles
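Beyond the built-in roles used in the later slides, role-based access control also supports user-defined roles. The sketch below is illustrative; the role name, database, and privilege set are assumptions, not taken from the slides:

```javascript
// Run in the mongo shell as a user with role-management privileges.
use admin
db.createRole(
  {
    role: "readInventory",   // hypothetical role name
    privileges: [
      {
        resource: { db: "inventory", collection: "" },  // all collections in "inventory"
        actions: [ "find" ]                             // read-only queries
      }
    ],
    roles: []                // no inherited roles
  }
)
```

The custom role can then be granted to users via the roles array of db.createUser(), just like the built-in roles shown on the following slides.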
  • 194. MongoDB security TLS/SSL (Transport Encryption) Configure mongod and mongos for TLS/SSL TLS/SSL Configuration for Clients
  • 195. MongoDB security Encrypt Communication Configure MongoDB to use TLS/SSL for all incoming and outgoing connections. Use TLS/SSL to encrypt communication between mongod and mongos components of a MongoDB deployment as well as between all applications and MongoDB. Starting in version 4.0, MongoDB uses the native TLS/SSL OS libraries (Secure Channel on Windows, Secure Transport on macOS, OpenSSL on Linux).
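A minimal startup sketch for requiring TLS on all connections follows. The option names are the 4.2+ spellings (older releases use --sslMode and --sslPEMKeyFile), and the certificate paths and hostname are assumptions:

```shell
# Start mongod requiring TLS for every incoming connection.
# /etc/ssl/mongodb.pem is an illustrative certificate-plus-key file.
mongod --tlsMode requireTLS \
       --tlsCertificateKeyFile /etc/ssl/mongodb.pem \
       --port 27017 --dbpath /var/lib/mongodb

# Connect a mongo shell over TLS, validating the server certificate
# against an illustrative CA file.
mongo --tls --host mongodb.example.net \
      --tlsCAFile /etc/ssl/ca.pem
```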
  • 196. MongoDB security Start MongoDB without access control:
mongod --port 27017 --dbpath /var/lib/mongodb
Connect to the instance:
mongo --port 27017
  • 197. MongoDB security
use admin
db.createUser(
  {
    user: "myUserAdmin",
    pwd: passwordPrompt(),  // or cleartext password
    roles: [
      { role: "userAdminAnyDatabase", db: "admin" },
      "readWriteAnyDatabase"
    ]
  }
)
  • 198. MongoDB security Re-start the MongoDB instance with access control.
db.adminCommand( { shutdown: 1 } )
From the terminal, re-start the mongod instance with the --auth command line option or, if using a configuration file, the security.authorization setting:
mongod --auth --port 27017 --dbpath /var/lib/mongodb
  • 199. MongoDB security Start a mongo shell with the -u <username>, -p, and --authenticationDatabase <database> command line options:
$ mongo --host 192.168.1.103 --port 27017 --authenticationDatabase "admin" -u "myUserAdmin" -p "qwerty"
  • 200. MongoDB security
use test
db.createUser(
  {
    user: "myTester",
    pwd: passwordPrompt(),  // or cleartext password
    roles: [
      { role: "readWrite", db: "test" },
      { role: "read", db: "reporting" }
    ]
  }
)
  • 201. MongoDB security
mongo --host 192.168.1.103 --port 27017 -u "myTester" --authenticationDatabase "test" -p "asdfg"
db.foo.insert( { x: 1, y: 1 } )
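With the roles granted above, myTester can write to the test database but only read from reporting. A short illustrative session (the "not authorized" outcome follows from the role assignments; the collection name foo is carried over from the slide):

```javascript
// Connected to the shell as myTester.
use test
db.foo.insert( { x: 1, y: 1 } )   // succeeds: readWrite on "test"

use reporting
db.foo.find()                     // succeeds: "read" role allows queries
db.foo.insert( { x: 2 } )         // fails: no write privilege on "reporting"
```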
  • 202. MongoDB Authentication ● Authentication Methods ● Authentication Mechanisms ● Internal Authentication ● Authentication on Sharded Clusters
  • 203. MongoDB Authentication
use reporting
db.createUser(
  {
    user: "reportsUser",
    pwd: passwordPrompt(),  // or cleartext password
    roles: [
      { role: "read", db: "reporting" },
      { role: "read", db: "products" },
      { role: "read", db: "sales" },
      { role: "readWrite", db: "accounts" }
    ]
  }
)
  • 204. MongoDB LDAP Authentication Users that will authenticate to MongoDB using an external authentication mechanism, such as LDAP, must be created in the $external database, which allows mongos or mongod to consult an external source for authentication. Changed in version 3.6.3: To use sessions with $external authentication users (i.e. Kerberos, LDAP, x.509 users), the usernames cannot be greater than 10k bytes. For LDAP authentication, you must specify a username. You do not need to specify the password, as that is handled by the LDAP service. The following operation adds the reporting user with read-only access to the records database.
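The deck ends before showing the operation the slide refers to. A sketch of what it would look like, following the $external pattern described above; the username format is an assumption and depends on your LDAP configuration:

```javascript
// Create an LDAP-authenticated user in the $external database.
// No password is stored in MongoDB; the LDAP service verifies credentials.
db.getSiblingDB("$external").createUser(
  {
    user: "reporting@EXAMPLE.NET",   // illustrative LDAP username
    roles: [ { role: "read", db: "records" } ]
  }
)
```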