2. Contents
Intro to MongoDB
Why use it?
Performance analysis
Documents and Collections
Querying
Schema Design
Sharding
Security
Applications
Conclusion
4. History
MongoDB’s name comes from the middle five letters of the
word “humongous”, reflecting its focus on big data.
MongoDB was created by Dwight Merriman and Eliot Horowitz,
who had previously worked together at DoubleClick.
10gen began developing MongoDB in October 2007.
In 2009, MongoDB was open sourced as a stand-alone
product with an AGPL license.
Since version 1.4, released in March 2010, MongoDB has been
considered production ready.
5. What is MongoDB?
A scalable, high-performance, open-source, NoSQL
document-oriented database designed with both scalability and
developer agility in mind. It is written in C++ and built for speed.
Features:
Rich Document based queries for Easy readability.
Full Index Support for High Performance.
Replication and Failover for High Availability.
Auto Sharding for Easy Scalability.
Map / Reduce for Aggregation.
7. Why use MongoDB?
SQL was invented in the 1970s to store data.
MongoDB stores documents (objects).
Nowadays, everyone works with objects
(Python/Ruby/Java/etc.),
and we need databases to persist those objects. So
why not store objects directly?
Embedded documents and arrays reduce the need for
joins. There are no joins and no multi-document transactions.
15. Collection
Schema-less (or more accurately, "dynamic schema").
Contains Documents.
Indexable by one/more keys.
Created on-the-fly when referenced for the first time.
Capped Collections: Fixed size, older records get dropped
after reaching the limit.
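The capped-collection behaviour described above can be sketched in plain JavaScript. This is only an in-memory illustration, not the MongoDB API: the `CappedList` class and its document-count cap are assumptions made for the example, while the shell call shown in the comment is how a real capped collection is created.

```javascript
// Hypothetical in-memory stand-in for a capped collection (not the MongoDB API).
// A real capped collection is created in the shell with, for example:
//   db.createCollection("log", { capped: true, size: 100000, max: 3 })
class CappedList {
  constructor(maxDocs) {
    this.maxDocs = maxDocs; // cap expressed as a document count
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, drop the oldest documents first.
    while (this.docs.length > this.maxDocs) {
      this.docs.shift();
    }
  }

  find() {
    return this.docs.slice(); // insertion order, oldest first
  }
}

const log = new CappedList(3);
for (let n = 1; n <= 5; n++) {
  log.insert({ n: n });
}
console.log(log.find()); // only the documents with n = 3, 4, 5 remain
```

Real capped collections are bounded by size in bytes (with an optional document count), so the count-only cap here is a simplification.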
16. Document
Stored in a Collection.
Has an _id key, which works like a primary key in MySQL.
Supported relationships: embedded documents or references.
Documents are stored as BSON (a binary form of JSON). Large
files (images, videos, anything...) can be stored via GridFS.
17. Embedded Objects
Documents can embed other documents
For example:
{
name: 'Brad Majors',
address:
{
street: 'Oak Terrace',
city: 'Denton'
}
}
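To contrast the two supported relationship styles, here is a small sketch in plain JavaScript. The `addresses` array, the `address_id` field and the helper names are made up for illustration; the point is only that the referenced form needs a second lookup performed by the application.

```javascript
// Embedded: the address lives inside the user document.
const embeddedUser = {
  name: "Brad Majors",
  address: { street: "Oak Terrace", city: "Denton" }
};

// Referenced: the user stores only the _id of an address kept elsewhere.
// The collection contents and the address_id field are hypothetical.
const addresses = [
  { _id: "addr1", street: "Oak Terrace", city: "Denton" }
];
const referencedUser = { name: "Brad Majors", address_id: "addr1" };

// One read is enough for the embedded form.
function cityOfEmbedded(user) {
  return user.address.city;
}

// The referenced form needs a second lookup, done by the application.
function cityOfReferenced(user, addressCollection) {
  const match = addressCollection.find(function (a) {
    return a._id === user.address_id;
  });
  return match ? match.city : undefined;
}

console.log(cityOfEmbedded(embeddedUser));
console.log(cityOfReferenced(referencedUser, addresses));
```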
19. Querying
Query Expression Objects:
MongoDB supports a number of query objects for fetching data.
Simple query:
db.users.find({})
More selective:
db.users.find({'last_name': 'Smith'})
Query Options:
Field Selection:
// retrieve ssn field for documents where last_name == 'Smith':
db.users.find({last_name: 'Smith'}, {'ssn': 1});
// retrieve all fields *except* the thumbnail field, for all documents:
db.users.find({}, {thumbnail: 0});
20. Sorting:
db.users.find({}).sort({last_name: 1});
// return all documents, sorted by last name in ascending order
Skip and Limit:
db.users.find().skip(20).limit(10);
// skips the first 20 documents and limits the result set to 10
db.users.find({}, {}, 10, 20);
// same as above, but less clear
Cursors: Used to iteratively retrieve all the documents returned
by the query.
> var cur = db.example.find();
> cur.forEach( function(x) { print(tojson(x)) });
{"n" : 1 , "_id" : "497ce96f395f2f052a494fd4"}
{"n" : 2 , "_id" : "497ce971395f2f052a494fd5"}
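As a rough mental model of the query options above, the following plain-JavaScript sketch applies the same steps to an in-memory array. The sample `users` data and the helper names `sortSkipLimit` and `excludeField` are assumptions for illustration, not MongoDB functions.

```javascript
// Hypothetical sample data standing in for the users collection.
const users = [
  { last_name: "Smith", ssn: "111", thumbnail: "a.png" },
  { last_name: "Adams", ssn: "222", thumbnail: "b.png" },
  { last_name: "Jones", ssn: "333", thumbnail: "c.png" },
  { last_name: "Brown", ssn: "444", thumbnail: "d.png" }
];

// Roughly what find({}).sort({[field]: 1}).skip(skip).limit(limit) returns:
// sort first, then skip, then limit.
function sortSkipLimit(docs, field, skip, limit) {
  const sorted = docs.slice().sort(function (a, b) {
    if (a[field] < b[field]) return -1;
    if (a[field] > b[field]) return 1;
    return 0;
  });
  return sorted.slice(skip, skip + limit);
}

// Roughly what an exclusion projection such as {thumbnail: 0} does.
function excludeField(doc, field) {
  const copy = Object.assign({}, doc);
  delete copy[field];
  return copy;
}

const page = sortSkipLimit(users, "last_name", 1, 2);
console.log(page.map(function (d) { return d.last_name; })); // Brown, Jones
console.log(excludeField(users[0], "thumbnail"));
```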
23. Query examples
- using embedded documents & referenced documents
Exact match on an entire embedded object
db.users.find( {address: {street: 'Oak Terrace',
city: 'Denton'}} )
Dot-notation for a partial match
db.users.find( {"address.city": 'Denton'} )
Dot notation also allows deep, nested queries
db.order.find( { "shipping.carrier": "usps" } );
Here shipping is an embedded document (object).
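The difference between an exact match and a dot-notation match can be sketched in plain JavaScript. The helper names are hypothetical, and the sketch ignores arrays; it only illustrates that an exact match compares the whole embedded document (including field order), while dot notation compares a single nested field.

```javascript
// Resolve a dotted path such as "address.city" against a document.
function getPath(doc, path) {
  return path.split(".").reduce(function (value, key) {
    return value === undefined || value === null ? undefined : value[key];
  }, doc);
}

// Dot-notation style match: only the named nested field is compared.
function matchesPath(doc, path, expected) {
  return getPath(doc, path) === expected;
}

// Exact embedded-document match: same fields, same values, same order.
// JSON.stringify serves here as a simple order-sensitive comparison.
function matchesExact(doc, field, expected) {
  return JSON.stringify(doc[field]) === JSON.stringify(expected);
}

const user = {
  name: "Brad Majors",
  address: { street: "Oak Terrace", city: "Denton" }
};

console.log(matchesPath(user, "address.city", "Denton"));       // true
console.log(matchesExact(user, "address", { city: "Denton" })); // false
console.log(matchesExact(user, "address",
  { street: "Oak Terrace", city: "Denton" }));                  // true
```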
25. Schema Design
There is no predefined schema; the schema is dynamic.
The application creates an ad-hoc schema with the objects it creates.
The schema is implicit in the queries
Collections to represent the top-level classes
Less normalization, more embedding
27. MongoDB:
This slide depicts a Shapes object stored as a JSON-type
document.
Storing it this way avoids the storage space involved in creating
columns and rows.
28. Better Schema Design: Embedding
Collection for posts
Embed comments, author name
post = {
author: 'Michael Arrington',
text: 'This is a pretty awesome post.',
comments: [
'Whatever this post.',
'I agree, lame!'
]
}
29. Schema Design Limitations
No referential integrity
High degree of denormalization means updating
something in many places instead of one
Lack of predefined schema is a double-edged sword
The application should therefore define the model
Objects within a collection can be completely inconsistent in
their fields
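Since the database does not enforce a schema, one option is to keep a small model in the application and check documents before saving them. The sketch below is one possible shape for that; the `postModel` description and the `validate` helper are hypothetical, and the field names follow the post example from the embedding slide.

```javascript
// Hypothetical model for post documents; field names follow the post example.
const postModel = {
  author: "string",
  text: "string",
  comments: "array"
};

// Return a list of problems; an empty list means the document fits the model.
function validate(doc, model) {
  const problems = [];
  Object.keys(model).forEach(function (field) {
    const value = doc[field];
    const actual = Array.isArray(value) ? "array" : typeof value;
    if (actual !== model[field]) {
      problems.push(field + ": expected " + model[field] + ", got " + actual);
    }
  });
  return problems;
}

const goodPost = {
  author: "Michael Arrington",
  text: "This is a pretty awesome post.",
  comments: []
};
const badPost = { author: "Michael Arrington", comments: "none" };

console.log(validate(goodPost, postModel)); // no problems
console.log(validate(badPost, postModel));  // text is missing, comments is not an array
```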
30. MongoDB Admin UIs
Some UIs are available as separate community projects and are
listed below. Some are focused on administration, while some focus
on data viewing.
Tools
MongoExplorer
MongoVUE
PHPMoAdmin
Meclipse
Commercial
Database Master
Data Viewers
mongs
31. MongoVue:
MongoVUE is a .NET GUI for MongoDB. It is an elegant and highly
usable interface for working with MongoDB. It helps in
managing web-scale data.
32. Database Master
Database Master from Nucleon Software
Features:
Tree view for dbs and collections
Create/Drop indexes
Server/DB stats
34. Sharding
Data is split up into chunks, each of which is assigned to a
shard
Shard: A single server or a replica set
Config servers: Store metadata about chunks
and data location
mongos: Routes requests in a transparent way
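The routing idea can be sketched in plain JavaScript: a table maps shard-key ranges (chunks) to shards, and the router picks the shard that owns a given key. The chunk table, shard names and `routeToShard` helper below are all hypothetical, and real chunk boundaries are more general than single-letter string ranges.

```javascript
// Hypothetical chunk table of the kind the config servers would hold.
// Each chunk covers shard-key values in the range [min, max).
const chunks = [
  { min: "a", max: "h", shard: "shard0" },
  { min: "h", max: "p", shard: "shard1" },
  { min: "p", max: "{", shard: "shard2" } // "{" sorts just after "z"
];

// What a router does conceptually: find the chunk that owns the key.
function routeToShard(shardKeyValue, chunkTable) {
  const chunk = chunkTable.find(function (c) {
    return shardKeyValue >= c.min && shardKeyValue < c.max;
  });
  return chunk ? chunk.shard : null;
}

console.log(routeToShard("denton", chunks)); // shard0
console.log(routeToShard("jones", chunks));  // shard1
console.log(routeToShard("smith", chunks));  // shard2
```

The application only ever talks to the router, which is why the routing is described as transparent.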
37. Some Cool features
Geo-spatial Indexes for Geo-spatial queries.
$near (optionally with $maxDistance), $within for bounded queries (circle, box)
GridFS
Stores Large Binary Files.
Map/Reduce
What GROUP BY does in SQL, map/reduce does in MongoDB.
38. Map/Reduce
A data processing tool with some basic aggregation capabilities.
Parallelized for working with large sets of data.
mapReduce takes a map function, a reduce function and an
output directive.
Map function: A master node takes an input, splits it into
smaller sections and sends them to the associated nodes.
These nodes may in turn do the same, sending still
smaller sections of the input to other nodes. Each node
processes its part of the problem and sends the result back to
the master node.
39. Reduce Function:
The master node aggregates those results to find the output.
We can then run the mapReduce command against a
hits collection:
> db.hits.mapReduce(map, reduce,{out: {inline:1}});
We could instead specify {out: 'hit_stats'} and have the results
stored in the hit_stats collection:
> db.hits.mapReduce(map, reduce,{out:'hit_stats'});
> db.hit_stats.find();
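The commands above use `map` and `reduce` without defining them. A minimal sketch of what they might look like follows, together with a tiny in-memory stand-in for the server so the example runs in plain JavaScript. The `resource` field, the sample `hits` data and the `runMapReduce` helper are assumptions for illustration; only the `emit(key, value)` convention and the `(key, values)` reduce signature come from the shell.

```javascript
// Shell-style map function: called once per document, with `this` bound to it.
// The `resource` field is a hypothetical field of the hits documents.
function map() {
  emit(this.resource, 1);
}

// Shell-style reduce function: combines all values emitted for one key.
function reduce(key, values) {
  return values.reduce(function (sum, v) { return sum + v; }, 0);
}

// Minimal in-memory stand-in for the server side of mapReduce.
let emitted = {};
function emit(key, value) {
  if (!Object.prototype.hasOwnProperty.call(emitted, key)) {
    emitted[key] = [];
  }
  emitted[key].push(value);
}

function runMapReduce(docs, mapFn, reduceFn) {
  emitted = {};
  docs.forEach(function (doc) { mapFn.call(doc); });
  return Object.keys(emitted).map(function (key) {
    return { _id: key, value: reduceFn(key, emitted[key]) };
  });
}

// Hypothetical sample data standing in for the hits collection.
const hits = [
  { resource: "index" },
  { resource: "about" },
  { resource: "index" }
];

console.log(runMapReduce(hits, map, reduce)); // "index" counted twice, "about" once
```

This counts hits per resource, which is the kind of result a GROUP BY with COUNT would give in SQL.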
41. MongoDB Security
A trusted environment is the default.
The current version of Mongo supports only basic security.
We can authenticate a username and password in the context
of a particular database.
Once authenticated, a normal user has full read and write
access to the database.
$ ./mongo
> use admin
> db.auth("someAdminUser", password)
42. MongoDB Security contd..
If there are no admin users, we should first create an
administrator user for the entire db server process. This user is
stored under the special admin database.
One may access the database from the localhost interface
without authenticating.
Thus, from the server running the database, configure an
administrative user:
$ ./mongo
> use admin
> db.addUser("theadmin", "anadminpassword")
43. MongoDB Security contd..
Now, let's configure a "regular" user for another database.
> use projectx
> db.addUser("joe", "passwordForJoe")
Finally, let's add a read-only user.
> use projectx
> db.addUser("guest", "passwordForGuest", true)
44. Cool uses
Data Warehouse
Mongo understands JSON natively
Very powerful for analysis
Query a bunch of data from a web service
Import into mongo (mongoimport --file filename.json)
Harmonyapp.com
Large rails app for building websites (kind of a CMS)
Hardcore debugging
Spit out large amounts of data
46. Applications
RDBMS replacement for Web Applications.
Semi-structured Content Management.
Real-time Analytics & High-Speed Logging.
Caching and High Scalability.
Web 2.0, Media, SAAS, Gaming
HealthCare, Finance, Telecom, Government
49. Conclusion
MongoDB is fast.
It achieves high performance.
It bridges the gap between traditional RDBMSs at one end and
key-value stores at the other end.
Document model is simple but powerful.
Advanced features like map/reduce, geospatial indexing etc. are
very compelling.
Very rapid development, open source & surprisingly great
drivers for most languages.