2. MongoDB Overview
MongoDB is a cross-platform, document-oriented database program. Classified as a NoSQL
database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is
developed by MongoDB Inc. and licensed under the Server Side Public License.
(Example relational table: EMPLOYEE with columns NAME, ID, COMPANY)
3. NoSQL DB Overview
Unlike relational databases such as MySQL and Oracle, NoSQL databases deal in documents and files rather than tables.
NoSQL, which stands for "not only SQL," is an alternative to traditional relational databases, in which
data is placed in tables and the data schema is carefully designed before the database is built. NoSQL
databases are especially useful for working with large sets of distributed data.
4. JSON Introduction
JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to
read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript
Programming Language Standard ECMA-262 3rd Edition - December 1999.
6. JSON Data types
At the granular level, JSON consists of six data types. The first four (string, number, boolean,
and null) can be referred to as simple data types; the other two (object and array) can be referred to as
complex data types.
1. string
2. number
3. boolean
4. null/empty
5. object
6. array
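All six types can appear in a single document. Here is a minimal Python sketch (the field names are invented for illustration) that round-trips such a document through the standard json module:

```python
import json

# One document exercising all six JSON data types:
doc = {
    "title": "MongoDB Overview",   # string
    "likes": 100,                  # number
    "published": True,             # boolean
    "editor": None,                # null
    "author": {"name": "abc"},     # object
    "tags": ["mongodb", "NoSQL"],  # array
}

text = json.dumps(doc)            # serialize to a JSON string
assert json.loads(text) == doc    # parsing it back recovers the document
print(text)
```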
8. MongoDB Overview
Document Database
A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB
documents are similar to JSON objects. The values of fields may include other documents, arrays, and
arrays of documents.
9. MongoDB Collections/Views
MongoDB stores documents in collections. Collections are analogous to tables in relational databases.
In addition to collections, MongoDB supports:
● Read-only Views (Starting in MongoDB 3.4)
● On-Demand Materialized Views (Starting in MongoDB 4.2)
10. MongoDB Overview
MongoDB is a free and open-source database.
It is a NoSQL database that uses JSON-like documents with flexible schemas.
It is a cross-platform database and very easy to deploy on the cloud and on servers.
It is one of the most important databases you can work with these days.
MongoDB makes working with data simple.
It prioritizes performance and efficiency.
It is very popular, and MongoDB developers are in high demand.
11. MongoDB CRUD
● How to perform CRUD (Create, Read, Update, Delete) operations on
MongoDB databases
15. MongoDB Delete
Delete operations remove documents from a collection. MongoDB provides the following methods to delete
documents of a collection:
● db.collection.deleteOne() New in version 3.2
● db.collection.deleteMany() New in version 3.2
16. MongoDB Objectives
● How to perform CRUD (Create, Read, Update, Delete) operations on MongoDB databases
● How to filter for data efficiently
● How to work with both the Mongo Shell and drivers (e.g. Node.js driver)
● How to increase performance by using indexes (and how to use the right indexes!)
● How to use the amazing “Aggregation Framework” that’s built into MongoDB
● What replica sets and sharding are
● How to use MongoDB Atlas — the cloud solution offered by MongoDB
● How to use the serverless platform (Stitch) offered by MongoDB
17. MongoDB Advantages
● Schema-less − MongoDB is a document database in which one collection holds different documents. The number of fields,
content, and size of the documents can differ from one document to another.
● Structure of a single object is clear.
● No complex joins.
● Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language that's nearly
as powerful as SQL.
● Easy tuning.
● Ease of scale-out − MongoDB is easy to scale.
● Conversion/mapping of application objects to database objects is not needed.
● Uses internal memory for storing the (windowed) working set, enabling faster access to data.
● Scalability
18. MongoDB Objectives
● Download and install MongoDB
● Modify environment variables
● Start and stop the MongoDB server
● Connect to MongoDB using Python
● Perform CRUD operations
19. MongoDB Objectives
● Create a database
● Create a collection
● Insert documents
● Combine collections
● Import data into MongoDB
● Backup and restore MongoDB
● Create indexes
20. Why Use MongoDB?
● Document Oriented Storage − Data is stored in the form of JSON style documents.
● Index on any attribute
● Replication and high availability
● Auto-sharding
● Rich queries
● Fast in-place updates
● Professional support by MongoDB
21. Where to use MongoDB?
● Big Data
● Content Management and Delivery
● Mobile and Social Infrastructure
● User Data Management
● Data Hub
23. MongoDB Data Modelling
Suppose a client needs a database design for a blog/website; compare the RDBMS and MongoDB
schema designs. The website has the following requirements:
● Every post has the unique title, description and url.
● Every post can have one or more tags.
● Every post has the name of its publisher and total number of likes.
● Every post has comments given by users along with their name, message, date-time and likes.
● On each post, there can be zero or more comments
27. MongoDB Conf file and Settings
/etc/mongodb.conf
A given Mongo database is broken up into a series of BSON files on disk, increasing in size up to 2GB. BSON is its
own format, built specifically for MongoDB.
28. MongoDB login
$ mongo
This will try to connect to localhost.
Using an IP address and port:
$ mongo --host 192.168.64.24:27017
Or
$ mongo --host 192.168.64.24 --port 27017
29. MongoDB User Creation
1. Set up your user:
use cool_db
db.createUser({
  user: 'john',
  pwd: 'secretPassword',
  roles: [{ role: 'readWrite', db: 'test' }]
})
30. MongoDB User Creation
$ mongo -u john -p secretPassword 192.168.64.24/test
MongoDB shell version v3.6.3
connecting to: mongodb://192.168.64.24:27017/test
MongoDB server version: 3.6.3
> db.inv.insert({"name":"abc1", "add":"street3"})
WriteResult({ "nInserted" : 1 })
31. MongoDB User Creation
1. Set up your user for read-only access:
use cool_db
db.createUser({
  user: 'Syam',
  pwd: 'secretPassword',
  roles: [{ role: 'read', db: 'test' }]
})
32. MongoDB User Creation
$ mongo -u Syam -p secretPassword 192.168.64.24/test
MongoDB shell version v3.6.3
connecting to: mongodb://192.168.64.24:27017/test
MongoDB server version: 3.6.3
> db.inv.insert({"name":"abc1", "add":"street3"})
WriteResult({
  "writeError" : {
    "code" : 13,
    "errmsg" : "not authorized on test to execute command { insert: \"inv\", ordered: true, $db: \"test\" }"
  }
})
35. MongoDB security authorization
Create a user before enabling auth=true in mongodb.conf:
db.createUser(
... {
... user: "myUserAdmin",
... pwd: "abc123",
... roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
... }
... )
Then uncomment auth=true, restart MongoDB, and connect:
mongo --host 192.168.64.24 --port 27017 -u myUserAdmin -p --authenticationDatabase admin
36. MongoDB Create DB
The use DATABASE_NAME command is used to create a database. The command creates a new database if it
doesn't exist; otherwise, it switches to the existing database.
If you don't create any database, collections are stored in the test database.
>use mydb
switched to db mydb
To drop the current database:
>db.dropDatabase()
39. MongoDB Create Collection
The cool thing about MongoDB is that you do not need to create a collection before you insert documents into it.
With a single command you can insert a document into a collection, and MongoDB creates that
collection on the fly.
Syntax: db.collection_name.insert({key: value, key: value, …})
42. MongoDB Collection insert methods
db.collection.insertOne()
Inserts a single document into a collection.
db.collection.insertMany()
Inserts multiple documents into a collection.
db.collection.insert()
Inserts a single document or multiple documents into a collection.
44. MongoDB Collection insert multiple
We can also pass an array of documents:
db.post.insert([
{
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100
},
{
title: 'NoSQL Database',
description: "NoSQL database doesn't have tables",
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 20,
comments: [
{
user:'user1',
message: 'My first comment',
dateCreated: new Date(2013,11,10,2,35),
like: 0
}
]
}
])
45. MongoDB Collection find method
db.COLLECTION_NAME.find()
The find() method displays all the documents in a non-structured way.
The pretty() Method
To display the results in a formatted way, you can use pretty() method.
Syntax
>db.mycol.find().pretty()
49. MongoDB Collection select data
db.inventory.find( {} )
This operation corresponds to the following SQL statement:
SELECT * FROM inventory
db.inventory.find( { status : "D" } )
This operation corresponds to the following SQL statement:
SELECT * FROM inventory WHERE status = "D"
50. MongoDB Collection select data
The following example retrieves all documents from the inventory collection where status equals either "A" or
"D":
db.inventory.find( { status: { $in: [ "A", "D" ] } } )
SELECT * FROM inventory WHERE status IN ("A", "D")
51. MongoDB Collection select data
The following example retrieves all documents from the inventory collection where status equals neither "A" nor
"D":
db.inventory.find( { status: { $nin: [ "A", "D" ] } } )
SELECT * FROM inventory WHERE status NOT IN ("A", "D")
52. MongoDB Collection select data
# Count the number of matching documents
db.inventory.count( { status: { $nin: [ "A", "D" ] } } )
db.inventory.find( { status: "A", qty: { $lt: 30 } } )
The operation corresponds to the following SQL statement:
SELECT * FROM inventory WHERE status = "A" AND qty < 30
53. OR in MongoDB
Using the $or operator, you can specify a compound query that joins each clause with a logical OR conjunction so
that the query selects the documents in the collection that match at least one condition.
db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
SELECT * FROM inventory WHERE status = "A" OR qty < 30
db.inventory.find( { $and: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
55. OR in MongoDB
To query documents based on an OR condition, use the $or operator. Following is the basic syntax of $or −
>db.mycol.find(
{
$or: [
{key1: value1}, {key2:value2}
]
}
).pretty()
56. OR in MongoDB
Following example will show all the tutorials written by 'tutorials point' or whose title is 'MongoDB Overview'.
>db.mycol.find({$or:[{"by":"tutorials point"},{"title": "MongoDB Overview"}]}).pretty()
{
"_id": ObjectId(7df78ad8902c),
"title": "MongoDB Overview",
"description": "MongoDB is no sql database",
"by": "tutorials point",
"url": "http://www.tutorialspoint.com",
"tags": ["mongodb", "database", "NoSQL"],
"likes": "100"
}
>
57. AND and OR in MongoDB
WHERE likes > 10 AND (by = 'tutorials point' OR title = 'MongoDB Overview')
>db.mycol.find({"likes": {$gt:10}, $or: [{"by": "tutorials point"},
{"title": "MongoDB Overview"}]}).pretty()
{
"_id": ObjectId(7df78ad8902c),
"title": "MongoDB Overview",
"description": "MongoDB is no sql database",
"by": "tutorials point",
"url": "http://www.tutorialspoint.com",
"tags": ["mongodb", "database", "NoSQL"],
"likes": "100"
}
59. insertMany check null
db.inventory.insertMany([
{ _id: 1, item: null },
{ _id: 2 }
])
db.inventory.find( { item: null } )
The { item : { $type: 10 } } query matches only documents that contain the item field whose value
is null; i.e., the value of the item field is of BSON type Null (type number 10):
db.inventory.find( { item : { $type: 10 } } )
64. MongoDB Collection Update methods
The following methods can also add new documents to a collection:
● db.collection.update() when used with the upsert: true option.
● db.collection.updateOne() when used with the upsert: true option.
● db.collection.updateMany() when used with the upsert: true option.
● db.collection.findAndModify() when used with the upsert: true option.
● db.collection.findOneAndUpdate() when used with the upsert: true option.
● db.collection.findOneAndReplace() when used with the upsert: true option.
● db.collection.save().
● db.collection.bulkWrite().
66. MongoDB Collection Delete methods
The following example deletes all documents where status is "A":
db.inventory.deleteMany({ status : "A" })
The following example deletes the first document where status is "D":
db.inventory.deleteOne( { status: "D" } )
68. Data Modelling One to One
Normalized
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
69. Data Modelling One to One
Denormalized (Better Option)
{
_id: "joe",
name: "Joe Bookreader",
address: {
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
}
70. Data Modelling One to Many
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
{
patron_id: "joe",
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: "12345"
}
71. Data Modelling One to Many Denormalized
{
_id: "joe",
name: "Joe Bookreader",
addresses: [
{
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
},
{
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: "12345"
}
]
}
Advantage: saves on the number of queries.
73. Data Modelling
The Process of Data Modeling in MongoDB
Data modeling improves database performance, but it requires weighing several considerations,
including:
● Data retrieval patterns
● Balancing needs of the application such as: queries, updates and data processing
● Performance features of the chosen database engine
● The Inherent structure of the data itself
74. Data Modelling
MongoDB Document Structure
Documents in MongoDB play a major role in deciding which technique to apply for a
given set of data. There are generally two ways to relate data:
● Embedded data
● Referenced data
75. Data Modelling
● Embedded data: related data is stored within a single document, either as a field value or an array
within the document itself. The main advantage of this approach is that the data is denormalized
and therefore provides an opportunity to manipulate the related data in a single database
operation. Consequently, this improves the rate at which CRUD operations are carried out,
hence fewer queries are required.
77. Data Modelling
Strengths of Embedding
1. Increased data access speed: for an improved rate of access to data, embedding is the best
option, since a single query operation can manipulate data within the specified document with
just a single database look-up.
2. Reduced data inconsistency: during operation, if something goes wrong (for example, a network
disconnection or power failure), only a small number of documents may be affected, since the
criteria often select a single document.
3. Reduced CRUD operations: read operations will typically outnumber writes. Besides, it is
possible to update related data in a single atomic write operation; i.e., for the above data, we can
update the phone number and also increase the distance with a single operation.
79. Data Modelling
Weaknesses of Embedding
1. Restricted document size: all documents in MongoDB are constrained to the BSON size limit of 16
megabytes. Therefore, the overall document size together with embedded data should not surpass
this limit. Otherwise, with some storage engines such as MMAPv1, the data may outgrow the
allotted space, causing data fragmentation and degraded write performance.
2. Data duplication: multiple copies of the same data make it harder to query the replicated data,
and it may take longer to filter embedded documents, which can outweigh the core advantage of
embedding.
80. Data Modelling
Dot Notation
Dot notation is how embedded data is accessed in queries. It is used to
access elements of an embedded document or an array. In the sample data above, we can return information
about the student whose location is "Embassy" with this query using dot notation:
db.users.find({'Settings.location': 'Embassy'})
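What dot notation does can be sketched in a few lines of Python (the resolve helper and the sample document are hypothetical, for illustration only; this is not a MongoDB API):

```python
# Walk a nested document by splitting the dotted path into field names.
def resolve(doc, dotted_path):
    value = doc
    for part in dotted_path.split("."):
        value = value[part]
    return value

user = {"name": "abc", "Settings": {"location": "Embassy"}}
print(resolve(user, "Settings.location"))  # Embassy
```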
81. Data Modelling
Flexible Schema
A flexible schema in MongoDB means that documents need not have the same fields, and the data type
of a field can differ across documents within a collection. The core advantage of this
concept is that one can add new fields, remove existing ones, or change field values to a new type,
and hence update a document into a new structure.
For example we can have these 2 documents in the same collection
84. Data Modelling
Validation can also be applied to already existing documents.
There are 3 levels of validation:
1. Strict: this is the default MongoDB validation level and it applies validation rules to all inserts and
updates.
2. Moderate: the validation rules are applied to inserts, and to updates of already existing
documents that fulfill the validation criteria; existing documents that do not fulfill the criteria are not checked.
3. Off: this level sets the validation rules for a given schema to null hence no validation will be done
to the documents.
86. Data Modelling
If we apply the moderate validation level using:
db.runCommand( {
collMod: "test",
validator: { $jsonSchema: {
bsonType: "object",
required: [ "phone", "name" ],
properties: {
phone: {
bsonType: "string",
description: "must be a string and is required"
},
name: {
bsonType: "string",
description: "must be a string and is required"
}
}
} },
validationLevel: "moderate"
} )
87. Data Modelling
Schema Validation Actions
After doing validation on documents, there may be some that may violate the validation rules. There is
always a need to provide an action when this happens.
MongoDB provides two actions that can be issued to the documents that fail the validation rules:
1. Error: this is the default MongoDB action, which rejects any insert or update in case it violates
the validation criteria.
2. Warn: This action will record the violation in the MongoDB log, but allows the insert or update
operation to be completed. For example:
88. Data Modelling
db.createCollection("students", {
validator: {$jsonSchema: {
bsonType: "object",
required: [ "name", "gpa" ],
properties: {
name: {
bsonType: "string",
description: "must be a string and is required"
},
gpa: {
bsonType: [ "double" ],
minimum: 0,
description: "must be a double and is required"
}
}
},
validationAction: "warn"
})
89. Data Modelling
db.students.insert( { name: "Amanda", status: "Updated" } )
The gpa field is missing even though it is a required field in the schema design, but since the
validation action has been set to warn, the document will be saved and a warning message will be
recorded in the MongoDB log.
A validation action cannot be set on the admin, local, and config databases.
90. Data Modelling
db.createCollection( "contacts5", {
validator: { $jsonSchema: {
bsonType: "object",
required: [ "phone" ],
properties: {
phone: {
bsonType: "string",
description: "must be a string and is required"
},
email: {
bsonType : "string",
pattern : "@mongodb.com$",
description: "must be a string and match the regular expression pattern"
},
status: {
enum: [ "Unknown", "Incomplete" ],
description: "can only be one of the enum values"
}
}
} },
validationAction: "error"
} )
94. Data Modelling DeNormalization
The advantage of this is that you need one less query to get the information. The
downside is that it takes up more space and is more difficult to keep in sync. For
example, suppose we decide that the "light" style should be renamed "day". We would have to
update every single document where user.accountsPref.style was "light".
97. Data Modelling Join
SQL equivalent is:
SELECT *, stockdata
FROM orders
WHERE stockdata IN (SELECT warehouse, instock
FROM warehouses
WHERE stock_item= orders.item
AND instock >= orders.ordered );
101. Data Modelling Nested Pipeline
The above is equivalent to the following SQL statement:
SELECT *, stockdata
FROM orders
WHERE stockdata IN (SELECT warehouse, instock
FROM warehouses
WHERE stock_item= orders.item
AND instock >= orders.ordered );
104. Data Modelling Tree Structures Parent reference
db.categories.insert( { _id: "MongoDB", parent: "Databases" } )
db.categories.insert( { _id: "dbm", parent: "Databases" } )
db.categories.insert( { _id: "Databases", parent: "Programming" } )
db.categories.insert( { _id: "Languages", parent: "Programming" } )
db.categories.insert( { _id: "Programming", parent: "Books" } )
db.categories.insert( { _id: "Books", parent: null } )
db.categories.findOne( { _id: "MongoDB" } ).parent
You can create an index on the field parent to enable fast search by the parent node:
db.categories.createIndex( { parent: 1 } )
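A quick Python sketch of how these parent references chain together (the dict mirrors the documents inserted above; the ancestors helper is illustrative, not a MongoDB API):

```python
# _id -> parent, mirroring the categories collection above.
categories = {
    "MongoDB": "Databases",
    "dbm": "Databases",
    "Databases": "Programming",
    "Languages": "Programming",
    "Programming": "Books",
    "Books": None,
}

def ancestors(node):
    # Follow parent references up to the root.
    path = []
    parent = categories[node]
    while parent is not None:
        path.append(parent)
        parent = categories[parent]
    return path

print(ancestors("MongoDB"))  # ['Databases', 'Programming', 'Books']
```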
105. Data Modelling Tree Structures
You can query by the parent field to find its immediate children nodes:
db.categories.find( { parent: "Databases" } )
107. Data Modelling Tree Structures child as reference
db.categories.findOne( { _id: "Databases" } ).children
Create an index on the children field to enable fast search by child nodes:
db.categories.createIndex( { children: 1 } )
You can query for a node in the children field to find its parent node as well as its siblings:
db.categories.find( { children: "MongoDB" } )
109. Data Modelling Tree Structures ancestors as reference
db.categories.findOne( { _id: "MongoDB" } ).ancestors
db.categories5.createIndex( { ancestors: 1 } )
You can query by the field ancestors to find all its descendants:
db.categories5.find( { ancestors: "Programming" } )
114. Data Modelling Operational Strategies
{ last_name : "Smith", best_score: 3.9 }
Change it to :
{ lname : "Smith", score : 3.9 }
And save 9 bytes of field-name overhead per document.
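Why 9 bytes: BSON stores every field name inside every document, so shortening the names saves space per document. A rough Python estimate (counting each field name plus its NUL terminator, ignoring the rest of the BSON layout):

```python
long_doc = {"last_name": "Smith", "best_score": 3.9}
short_doc = {"lname": "Smith", "score": 3.9}

def key_bytes(doc):
    # BSON stores each field name as a NUL-terminated C string per document.
    return sum(len(k.encode("utf-8")) + 1 for k in doc)

saved = key_bytes(long_doc) - key_bytes(short_doc)
print(saved)  # 9
```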
115. Data Modelling Operational Strategies
Data Lifecycle Management
Data modeling decisions should take data lifecycle management into consideration.
The Time to Live or TTL feature of collections expires documents after a period of time. Consider using the TTL
feature if your application requires some data to persist in the database for a limited period of time.
Additionally, if your application only uses recently inserted documents, consider Capped Collections. Capped
collections provide first-in-first-out (FIFO) management of inserted documents and efficiently support operations that
insert and read documents based on insertion order.
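The FIFO behavior of a capped collection can be sketched with Python's deque (a stand-in for illustration, not a MongoDB API): once the fixed capacity is reached, the oldest entry is evicted on each insert.

```python
from collections import deque

# A capped collection behaves like a fixed-size FIFO buffer:
# once full, the oldest inserted document is evicted to make room.
capped = deque(maxlen=3)
for doc in [{"_id": i} for i in range(5)]:
    capped.append(doc)

print(list(capped))  # only the three most recently inserted documents remain
```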
116. MongoDB backup Strategies
Backup Strategy #1: mongodump
Backup Strategy #2: Copying the Underlying Files
For example, Linux LVM quickly and efficiently creates a consistent snapshot of the file system that can be copied for backup and restore
purposes. To ensure that the snapshot is logically consistent, you must have journaling enabled within MongoDB.
Backup Strategy #3: MongoDB Management Service (MMS)
MongoDB Management Service provides continuous, online backup for MongoDB as a fully managed service. You install the Backup Agent
in your environment, which conducts an initial sync to MongoDB's secure and redundant datacenters. After the initial sync, the agent streams
encrypted and compressed oplog data to MMS, so that you have a continuous backup.
117. MongoDB Monitoring
When it comes to MongoDB monitoring, some of the important metrics to monitor are:
● Performance stats
● Utilization of resources (CPU usage, available memory and Network usage)
● Assert stats
● Replication stats
● Saturation of resources
● Throughput operations
Applications Manager MongoDB monitoring service supports all versions of MongoDB up to version 4.0.2.
121. MongoDB Relationship
This approach maintains all the related data in a single document, which makes it easy to retrieve and maintain. The whole
document can be retrieved in a single query such as −
>db.users.findOne({"name":"Tom Benzamin"},{"address":1})
123. MongoDB Scalability
The primary node receives all write operations. A replica set can have only one primary capable of confirming writes
with { w: "majority" } write concern, although in some circumstances another mongod instance may
transiently believe itself to also be primary. The primary records all changes to its data sets in its operation log, i.e.
the oplog. For more information on primary node operation, see Replica Set Primary.
The secondaries replicate the primary’s oplog and apply the operations to their data sets such that the secondaries’
data sets reflect the primary’s data set. If the primary is unavailable, an eligible secondary will hold an election to elect
itself the new primary. For more information on secondary members, see Replica Set Secondary Members.
126. MongoDB Scalability
An arbiter will always be an arbiter, whereas a primary may step down and become a secondary, and a secondary may
become the primary during an election.
127. MongoDB Scalability Failover
When a primary does not communicate with the other members of the set for more than the configured
electionTimeoutMillis period (10 seconds by default), an eligible secondary calls for an election to nominate
itself as the new primary. The cluster attempts to complete the election of a new primary and resume normal
operations.
128. MongoDB Scalability Failover
Read Preference
By default, clients read from the primary; however, clients can specify a read preference to send read operations to
secondaries.
136. MongoDB Algorithmic Sharding
Algorithmically sharded databases use a sharding function (partition_key) ->
database_id to locate data. A simple sharding function may be hash(key) %
NUM_DB.
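A minimal Python sketch of such a sharding function (NUM_DB and the key are assumptions for illustration; a stable digest is used because Python's built-in hash() is randomized per process):

```python
import hashlib

NUM_DB = 4  # assumed number of shards

def shard_for(partition_key: str) -> int:
    # Deterministically map a partition key to a shard id in [0, NUM_DB).
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_DB

shard = shard_for("user42")
print(0 <= shard < NUM_DB)  # True: always a valid shard id
```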
141. MongoDB Hierarchical Sharding
For example, if the shard key is:
{ a: 1, b: 1, c: 1 }
The mongos program can route queries that include the full shard key, or either of the following shard key
prefixes, to a specific shard or set of shards:
{ a: 1 }
{ a: 1, b: 1 }
142. MongoDB Shard Collections Distribution
All insertOne() operations target one shard. Each document in an insertMany() array targets a single
shard, but there is no guarantee that all documents in the array insert into a single shard.
All updateOne(), replaceOne() and deleteOne() operations must include the shard key or _id in the query
document. MongoDB returns an error if these methods are used without the shard key or _id.
Depending on the distribution of data in the cluster and the selectivity of the query, mongos may still perform a
broadcast operation to fulfill these queries.
143. MongoDB Components of Sharding
The components of a sharded cluster include:
1. A Shard – the basic unit: a MongoDB instance that holds a subset of the data. In production
environments, all shards need to be part of replica sets.
2. Config server – a MongoDB instance that holds metadata about the cluster, basically
information about the various MongoDB instances that hold the shard data.
3. A Router – a MongoDB instance that is responsible for redirecting the commands
sent by the client to the right servers.
144. MongoDB Components of Sharding & Setup
Setup Requirements
We require the following servers for a MongoDB sharding setup:
Query server – Server A
Config server & Shards / Replica Set – Server B
Config server & Shards / Replica Set – Server C
Config server & Shards / Replica Set – Server D
145. MongoDB Components of Sharding
Run in All servers
$ apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
$ echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.2 multiverse" | sudo tee
/etc/apt/sources.list.d/mongodb-org-3.2.list
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org
Skip the above steps if MongoDB is already installed and apt is set up.
150. MongoDb Sharding Query Router setup
MongoDB Sharding Query Routers setup in Server A
$ sudo service mongodb stop
In order to start the query router service:
# Using this same (following) command you can start multiple query servers
$ mongos --configdb config0.host.com:27019,config1.host.com:27019,config2.host.com:27019
151. Add Shard Server in Query Router
In Server A
$ mongo --host query0.example.com --port 27017
mongo> sh.addShard( "rep_set_name/rep_set_member:27017" )
$ mongo --port 27000
mongo> rs.status()
152. Enabling Sharding for a Database and Collection
$ mongo --host query0.example.com --port 27017
mongo> sh.stopBalancer()
#Enabling Sharding to the Database
mongo> use current_DB
mongo> sh.enableSharding("#{current_DB}")
Enabling Sharding to the Collection
mongo> use current_DB
mongo> sh.shardCollection( "current_DB.test_collection", { "_id" : "hashed" } )
153. MongoDB Indexes
What Problem do Indexes solve?
Slow Queries
Without indexes, MongoDB must perform a collection scan, i.e. scan every
document in a collection, to select those documents that match the query
statement. If an appropriate index exists for a query, MongoDB can use the
index to limit the number of documents it must inspect.
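The difference can be sketched in plain Python, with a dict standing in for an index (the documents are invented for illustration):

```python
# Sample "collection": even ids have status "D", odd ids have status "A".
docs = [{"_id": i, "status": "A" if i % 2 else "D"} for i in range(6)]

# Collection scan: inspects every document to find matches.
scan_hits = [d for d in docs if d["status"] == "A"]

# Index on "status": one lookup touches only the matching documents.
index = {}
for d in docs:
    index.setdefault(d["status"], []).append(d)
index_hits = index.get("A", [])

print(scan_hits == index_hits)  # True: same result, fewer documents inspected
```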
155. MongoDB Indexes
db.collection.createIndex( <key and index type specification>, <options> )
db.collection.createIndex( { name: -1 } )
You can create indexes with a custom name, such as one that is more human-readable than the default. For example, consider
an application that frequently queries the products collection to populate data on existing inventory. The following
createIndex() method creates an index on item and quantity named query for inventory:
db.products.createIndex(
{ item: 1, quantity: -1 } ,
{ name: "query for inventory" }
)
156. MongoDB Indexes Types
Single Field
In addition to the MongoDB-defined _id index, MongoDB supports the creation of user-defined
ascending/descending indexes on a single field of a document.
157. MongoDB Indexes Types
Compound Index
MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes.
The order of fields listed in a compound index has significance. For instance, if a compound index consists of
{userid: 1, score: -1 }, the index sorts first by userid and then, within each userid value, sorts by score.
158. MongoDB Indexes Types
Multikey Index
MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value,
MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to
select documents that contain arrays by matching on element or elements of the arrays. MongoDB automatically
determines whether to create a multikey index if the indexed field contains an array value; you do not need to
explicitly specify the multikey type.
160. MongoDB Indexes Types
Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes, which
use planar geometry when returning results, and 2dsphere indexes, which use spherical geometry to return results.
See 2d Index Internals for a high-level introduction to geospatial indexes.
161. MongoDB Indexes Types
Text Indexes
MongoDB provides a text index type that supports searching for string content in a collection. These text indexes do
not store language-specific stop words (e.g. “the”, “a”, “or”) and stem the words in a collection to only store root
words.
162. MongoDB Indexes Types
Hashed Indexes
To support hash based sharding, MongoDB provides a hashed index type, which indexes the hash of the value of a
field. These indexes have a more random distribution of values along their range, but only support equality matches
and cannot support range-based queries.
163. MongoDB Indexes Properties
Unique Indexes
The unique property for an index causes MongoDB to reject duplicate values for the indexed field. Other than the
unique constraint, unique indexes are functionally interchangeable with other MongoDB indexes.
Partial Indexes
New in version 3.2.
Partial indexes only index the documents in a collection that meet a specified filter expression. By indexing a subset
of the documents in a collection, partial indexes have lower storage requirements and reduced performance costs for
index creation and maintenance.
Partial indexes offer a superset of the functionality of sparse indexes and should be preferred over sparse indexes.
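A sketch in the mongo shell (the restaurants collection and the rating filter are illustrative):

```javascript
// Only documents with rating > 5 are indexed, shrinking the index.
db.restaurants.createIndex(
  { cuisine: 1 },
  { partialFilterExpression: { rating: { $gt: 5 } } }
)
```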
164. MongoDB Indexes Properties
Sparse Indexes
The sparse property of an index ensures that the index only contains entries for documents that have the indexed field. The index
skips documents that do not have the indexed field.
You can combine the sparse index option with the unique index option to prevent inserting documents that have duplicate values
for the indexed field(s) and skip indexing documents that lack the indexed field(s).
TTL Indexes
TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a
certain amount of time. This is ideal for certain types of information like machine generated event data, logs, and
session information that only need to persist in a database for a finite amount of time.
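A sketch in the mongo shell (the eventlog collection, createdAt field, and expiry window are illustrative):

```javascript
// Documents are removed roughly 3600 seconds after their createdAt value.
db.eventlog.createIndex( { createdAt: 1 }, { expireAfterSeconds: 3600 } )
```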
165. MongoDB Indexes Uses
Indexes can improve the efficiency of read operations. The Analyze Query Performance tutorial provides an example
of the execution statistics of a query with and without an index.
For information on how MongoDB chooses an index to use, see query optimizer.
166. MongoDB Indexes & Collation
To use an index for string comparisons, an operation must also specify the same collation. That is, an index with a
collation cannot support an operation that performs string comparisons on the indexed fields if the operation specifies
a different collation.
For example, the collection myColl has an index on a string field category with the collation locale "fr":
db.myColl.createIndex( { category: 1 }, { collation: { locale: "fr" } } )
The following query, which specifies the same collation, can use the index:
db.myColl.find( { category: "cafe" } ).collation( { locale: "fr" } )
The following query, which uses the default "simple" binary collation, cannot use the index:
db.myColl.find( { category: "cafe" } )
167. MongoDB Indexes & Collation
For example, the collection myColl has a compound index on the numeric fields score and price and the string
field category; the index is created with the collation locale "fr" for string comparisons:
db.myColl.createIndex(
{ score: 1, price: 1, category: 1 },
{ collation: { locale: "fr" } } )
The following operations, which use "simple" binary collation for string comparisons, can use the index:
db.myColl.find( { score: 5 } ).sort( { price: 1 } )
db.myColl.find( { score: 5, price: { $gt: NumberDecimal( "10" ) } } ).sort( { price: 1 } )
168. MongoDB Indexes & Collation
The following operation, which uses "simple" binary collation for string comparisons on the indexed category field,
can use the index to fulfill only the score: 5 portion of the query:
db.myColl.find( { score: 5, category: "cafe" } )
170. MongoDB Operations that support collation
Operations that Support Collation
All reading, updating, and deleting methods support collation. Some examples are listed below.
find() and sort()
Individual queries can specify a collation to use when matching and sorting results. The following query and sort
operation uses a German collation with the locale parameter set to de.
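A sketch of such a query (the myColl collection and its fields are illustrative):

```javascript
// Match and sort with a German collation.
db.myColl.find( { city: "Berlin" } ).sort( { name: 1 } ).collation( { locale: "de" } )
```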
175. MongoDB Aggregation Pipeline
db.orders.aggregate([
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
])
This pipeline works like Unix pipes: the output of each stage is passed as input to the next.
First Stage: The $match stage filters the documents by the status field and passes to the next stage those
documents that have status equal to "A".
Second Stage: The $group stage groups the documents by the cust_id field to calculate the sum of the amount
for each unique cust_id.
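The two stages can be sketched in plain JavaScript, with $match behaving like a filter and $group like a keyed reduce (the orders data is invented; this simulates the semantics, not the mongod implementation):

```javascript
// Sample orders shaped like the documents the pipeline above expects.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// First stage ($match): keep only documents whose status is "A".
const matched = orders.filter(doc => doc.status === "A");

// Second stage ($group): sum amount per unique cust_id.
const totals = {};
for (const doc of matched) {
  totals[doc.cust_id] = (totals[doc.cust_id] || 0) + doc.amount;
}

console.log(totals); // { A123: 750, B212: 200 }
```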
176. MongoDB Aggregation Pipeline
The MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through
the pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some
stages may generate new documents or filter out documents.
Pipeline stages can appear multiple times in the pipeline, with the exception of the $out, $merge, and $geoNear stages.
For a list of all available stages, see Aggregation Pipeline Stages.
MongoDB provides the db.collection.aggregate() method in the mongo shell and the aggregate command
to run the aggregation pipeline.
For example usage of the aggregation pipeline, consider Aggregation with User Preference Data and Aggregation
with the Zip Code Data Set.
178. MongoDB Performance
● Locking Performance
● Number of Connections
● Database Profiling
● Full Time Diagnostic Data Capture
179. MongoDB Performance
Number of Connections
In some cases, the number of connections between the applications and the database can overwhelm the ability of
the server to handle requests. The following fields in the serverStatus document can provide insight:
● connections is a container for the following two fields:
○ connections.current the total number of current clients connected to the database instance.
○ connections.available the total number of unused connections available for new clients.
If there are numerous concurrent application requests, the database may have trouble keeping up with demand. If this
is the case, then you will need to increase the capacity of your deployment.
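These fields can be read directly in the mongo shell (the counts shown are illustrative):

```javascript
// Inspect connection counts from the mongo shell.
db.serverStatus().connections
// e.g. { "current" : 21, "available" : 15979, "totalCreated" : 604 }
```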
180. MongoDB Performance
Database Profiling
The Database Profiler collects detailed information about operations run against a mongod instance. The profiler’s
output can help to identify inefficient queries and operations.
You can enable and configure profiling for individual databases or for all databases on a mongod instance. Profiler
settings affect only a single mongod instance and will not propagate across a replica set or sharded cluster.
See Database Profiler for information on enabling and configuring the profiler.
181. MongoDB Performance
The following profiling levels are available:
● 0: The profiler is off and does not collect any data. This is the default profiler level.
● 1: The profiler collects data for operations that take longer than the value of slowms.
● 2: The profiler collects data for all operations.
182. MongoDB Performance - DB profiling
Enable and Configure Database Profiling
This section uses the mongo shell helper db.setProfilingLevel() to enable profiling. For instructions using a
driver, see your driver documentation.
When you enable profiling for a mongod instance, you set the profiling level to a value greater than 0. The profiler records data
in the system.profile collection. MongoDB creates the system.profile collection in a database after you enable profiling
for that database.
To enable profiling and set the profiling level, pass the profiling level to the db.setProfilingLevel() helper. For example,
to enable profiling for all database operations, issue the following operation in the mongo shell:
db.setProfilingLevel(2)
184. MongoDB Performance - DB profiling
Specify the Threshold for Slow Operations
By default, the slow operation threshold is 100 milliseconds. To change the slow operation threshold, specify the desired
threshold value in one of the following ways:
● Set the value of slowms using the profile command or db.setProfilingLevel() shell helper method.
● Set the value of --slowms from the command line at startup.
● Set the value of slowOpThresholdMs in a configuration file.
For example, the following code sets the profiling level for the current mongod instance to 1 and sets the slow operation
threshold for the mongod instance to 20 milliseconds:
db.setProfilingLevel(1, { slowms: 20 })
A profiling level of 1 profiles only operations slower than the slowms threshold.
185. MongoDB Performance - DB profiling
IMPORTANT
The slow operation threshold applies to all databases in a mongod instance. It is used by both the database
profiler and the diagnostic log and should be set to the highest useful value to avoid performance degradation.
186. MongoDB Performance - DB profiling
Check Profiling Level
To view the profiling level, issue the following from the mongo shell:
db.getProfilingStatus()
To enable profiling for a mongod instance, pass the following options to mongod at startup.
$ mongod --profile 1 --slowms 15 --slowOpSampleRate 0.5
187. Example Database Profiler Queries
This section displays example queries to the system.profile collection. For an explanation of the
query output, see Database Profiler Output.
To return the most recent 10 log entries in the system.profile collection, run a query similar to the
following:
>db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()
To return all operations except command operations, run a query similar to the following:
>db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
188. Example Database Profiler Queries
To return operations for a particular collection, run a query similar to the following. This example
returns operations in the mydb database’s test collection:
>db.system.profile.find( { ns : 'mydb.test' } ).pretty()
To return operations slower than 5 milliseconds, run a query similar to the following:
>db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
190. Example Database Profiler Queries Based on Time
To return information from a certain time range, run a query similar to the following:
db.system.profile.find({
ts : {
$gt: new ISODate("2019-11-09T03:00:00Z"),
$lt: new ISODate("2019-11-09T03:40:00Z")
}
}).pretty()
191. Example: Create a New Profiler Collection
For example, to create a new system.profile collection that is 4,000,000 bytes in size, use the following sequence of operations in
the mongo shell:
db.setProfilingLevel(0)
db.system.profile.drop()
db.createCollection( "system.profile", { capped: true, size: 4000000 } )
db.setProfilingLevel(1)
195. MongoDB security
Encrypt Communication
Configure MongoDB to use TLS/SSL for all incoming and outgoing connections. Use TLS/SSL to encrypt
communication between mongod and mongos components of a MongoDB deployment as well as between all
applications and MongoDB.
Starting in version 4.0, MongoDB uses the native TLS/SSL OS libraries.
196. MongoDB security
Start MongoDB without access control.
mongod --port 27017 --dbpath /var/lib/mongodb
Connect to Instance
mongo --port 27017
198. MongoDB security
Restart the MongoDB instance with access control.
db.adminCommand( { shutdown: 1 } )
From the terminal, restart the mongod instance with the --auth command line option or, if using a configuration
file, the security.authorization setting.
mongod --auth --port 27017 --dbpath /var/lib/mongodb
199. MongoDB security
Start a mongo shell with the -u <username>, -p, and the --authenticationDatabase <database>
command line options:
$ mongo --host 192.168.1.103 --port 27017 --authenticationDatabase "admin" -u "myUserAdmin" -p "qwerty"
204. MongoDB LDAP Authentication
Users that will authenticate to MongoDB using an external authentication mechanism, such as LDAP, must be created
in the $external database, which allows mongos or mongod to consult an external source for authentication.
Changed in version 3.6.3: To use sessions with $external authentication users (i.e. Kerberos, LDAP, x.509 users),
the usernames cannot be greater than 10k bytes.
For LDAP authentication, you must specify a username. You do not need to specify the password, as that is handled by
the LDAP service.
The following operation adds the reporting user with read-only access to the records database.
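A sketch of that operation (the DN shown is a hypothetical LDAP username; note the user is created in the $external database):

```javascript
// LDAP-authenticated user; no password is stored, read-only on "records".
db.getSiblingDB("$external").createUser( {
  user: "cn=reporting,ou=users,dc=example,dc=com",
  roles: [ { role: "read", db: "records" } ]
} )
```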