MongoDB
          http://tinyurl.com/97o49y3

                            by toki
About me
● Delta Electronic CTBD Senior Engineer
● Main developer of http://loltw.net
  ○ Website built on MongoDB, serving 600k page views daily
  ○ Data grows every day via automated crawler bots
MongoDB - Simple Introduction
● Document-based NOSQL (Not Only SQL)
  database
● Started in 2007 by the company 10gen
● Written in C++
● Fast (But takes lots of memory)
● Stores JSON documents in BSON format
● Full index on any document attribute
● Horizontal scalability with auto sharding
● High availability & replica ready
What is database?
● Raw data
  ○ John is a student, he's 12 years old.
● Data
  ○ Student
    ■ name = "John"
    ■ age = 12
● Records
  ○ Student(name="John", age=12)
  ○ Student(name="Alice", age=11)
● Database
  ○ Student Table
  ○ Grades Table
Example of (relational) database

Tables and their columns (from the diagram):
● Student: Student ID, Name, Age, Class ID
● Class: Class ID, Name
● Grade: Grade ID, Name
● Student Grade: Grade ID, Student ID, Grade
SQL Language - How to find data?
● Find the student named John
  ○ select * from student where name="John"
● Find the class name of John
  ○ select s.name, c.name as class_name from student
    s, class c where s.name="John" and s.class_id=c.class_id
Why NOSQL?
● Big data
  ○ Modern data sets are too big for a single DB server
  ○ Google search engine
● Connectivity
  ○ Facebook like button
● Semi-structured data
  ○ Car equipment database
● High availability
  ○ The basis of cloud services
Common NOSQL DB characteristics
●   Schemaless
●   No join, stores pre-joined/embedded data
●   Horizontal scalability
●   Replica ready - High availability
Common types of NOSQL DB
● Key-Value
  ○ Based on Amazon's Dynamo paper
  ○ Stores K-V pairs
  ○ Example:
    ■ Dynomite
    ■ Voldemort
Common types of NOSQL DB
● Bigtable clones
  ○   Based on Google Bigtable paper
  ○   Column oriented, but handles semi-structured data
  ○   Data keyed by: row, column, time, index
  ○   Example:
      ■ Google Bigtable
      ■ HBase
      ■ Cassandra(FB)
Common types of NOSQL DB
● Document based
  ○ Stores multi-level K-V pairs
  ○ Usually uses JSON as the document format
  ○ Example:
    ■ MongoDB
    ■ CouchDB (Apache)
    ■ Redis (strictly a key-value / data structure store)
Common types of NOSQL DB
● Graph
  ○ Focus on modeling the structure of data -
    interconnectivity
  ○ Example
     ■ Neo4j
     ■ AllegroGraph
Start using MongoDB - Installation
● From apt-get (debian / ubuntu only)
  ○ sudo apt-get install mongodb
● Using the 10gen MongoDB repository
  ○ http://docs.mongodb.org/manual/tutorial/install-mongodb-on-debian-or-ubuntu-linux/
● From pre-built binary or source
  ○ http://www.mongodb.org/downloads
● Note:
  32-bit builds limited to around 2GB of data
Manually start your MongoDB
mkdir -p /tmp/mongo
mongod --dbpath /tmp/mongo

or

mongod -f mongodb.conf
Verify your MongoDB installation
$ mongo

MongoDB shell version: 2.2.0
connecting to: test
>_

--------------------------------------------------------
mongo localhost/test2
mongo 127.0.0.1/test
How many databases do you have?
show dbs
Elements of MongoDB
● Database
  ○ Collection
    ■ Document
What is JSON
● JavaScript Object Notation
● Elements of JSON
  ○ Object: K/V pairs
  ○ Key: string
  ○ Value, could be
    ■ string
    ■ bool
    ■ number
    ■ array
    ■ object
    ■ null
● Example
{
    "key1": "value1",
    "key2": 2.0,
    "key3": [1, "str", 3.0],
    "key4": false,
    "key5": {
        "name": "another object"
    }
}
Another sample of JSON
{
    "name": "John",
    "age": 12,
    "grades": {
        "math": 4.0,
        "english": 5.0
    },
    "registered": true,
    "favorite subjects": ["math", "english"]
}
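A quick way to check that a document like the one above is valid JSON is to parse it. The sketch below is plain JavaScript (runnable in Node.js, no MongoDB needed); the variable names are only illustrative.

```javascript
// The sample document from the slide, as a JSON string.
const text = '{"name": "John", "age": 12,' +
  ' "grades": {"math": 4.0, "english": 5.0},' +
  ' "registered": true,' +
  ' "favorite subjects": ["math", "english"]}';

// JSON.parse throws a SyntaxError on malformed input (e.g. a trailing comma).
const student = JSON.parse(text);

// Nested objects use dot access; keys containing spaces need bracket access.
console.log(student.grades.math);
console.log(student["favorite subjects"][1]);
```

Keys with spaces, such as "favorite subjects", are legal JSON but have to be quoted in queries as well.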
Insert document into MongoDB
s={
  "name": "John",
  "age": 12,
  "grades": {
      "math": 4.0,
      "english": 5.0
  },
  "registered": true,
  "favorite subjects": ["math", "english"]
}
db.students.insert(s);
Verify inserted document
db.students.find()

also try

db.student.insert(s)
show collections
Save document into MongoDB
s.name = "Alice"
s.age = 14
s.grades.math = 2.0

db.students.save(s)
What is _id / ObjectId ?
● _id is the default primary key used for indexing
  documents; it can be any BSON value except
  an array.
● By default, MongoDB auto-generates an
  ObjectId as _id
● ObjectId is a 12-byte value that uniquely
  identifies a document
● Use ObjectId().getTimestamp() to extract
  the timestamp stored in an ObjectId
   Bytes 0-3  : unix timestamp
   Bytes 4-6  : machine
   Bytes 7-8  : process id
   Bytes 9-11 : increment
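As a rough illustration of that byte layout, the following plain JavaScript sketch splits a 24-character ObjectId hex string into the four fields. The function name parseObjectId and the sample id are hypothetical, and the machine / process id split assumes the layout shown on this slide; other server versions may fill those middle bytes differently.

```javascript
// Sketch: split an ObjectId hex string into the fields shown on the slide.
// parseObjectId is a hypothetical helper, not part of the mongo shell.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  return {
    // bytes 0-3: seconds since the Unix epoch
    timestamp: new Date(parseInt(hex.slice(0, 8), 16) * 1000),
    // bytes 4-6: machine identifier
    machine: parseInt(hex.slice(8, 14), 16),
    // bytes 7-8: process id
    pid: parseInt(hex.slice(14, 18), 16),
    // bytes 9-11: incrementing counter
    increment: parseInt(hex.slice(18, 24), 16)
  };
}

// Example id chosen for illustration only.
const parts = parseObjectId("507f1f77bcf86cd799439011");
console.log(parts.timestamp.toISOString(), parts.pid, parts.increment);
```

Because the timestamp comes first, ObjectIds generated later sort after earlier ones, which matters when choosing a sharding key.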
Save document with id into MongoDB
s.name = "Bob"
s.age = 11
s['favorite subjects'] = ["music", "math", "art"]
s.grades.chinese = 3.0
s._id = 1

db.students.save(s)
Save document with existing _id
delete s.registered

db.students.save(s)
How to find documents?
● db.xxxx.find()
  ○ list all documents in collection
● db.xxxx.find(
    find spec, // what the documents should look like
    find fields, // which fields to return
    ...
  )
● db.xxxx.findOne()
  ○ returns only the first document matching the find spec.
find by id
db.students.find({_id: 1})
db.students.find({_id: ObjectId('xxx....')})
find and filter return fields
db.students.find({_id:   1},   {_id: 1})
db.students.find({_id:   1},   {name: 1})
db.students.find({_id:   1},   {_id: 1, name: 1})
db.students.find({_id:   1},   {_id: 0, name: 1})
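The four calls above differ only in the field selector. The sketch below mimics the inclusion rules in plain JavaScript; project is a hypothetical helper and handles only the simple include / exclude-_id cases used here, not everything MongoDB supports.

```javascript
// Sketch of inclusion-style projection: listed fields are kept,
// and _id is kept too unless it is explicitly set to 0.
function project(doc, fields) {
  const out = {};
  if (fields._id !== 0 && "_id" in doc) {
    out._id = doc._id;
  }
  Object.keys(fields).forEach(function (name) {
    if (name !== "_id" && fields[name] === 1 && name in doc) {
      out[name] = doc[name];
    }
  });
  return out;
}

const doc = {_id: 1, name: "Bob", age: 11};
console.log(project(doc, {_id: 1}));
console.log(project(doc, {name: 1}));
console.log(project(doc, {_id: 0, name: 1}));
```

The point to remember is that _id comes back by default, so {name: 1} and {_id: 1, name: 1} return the same fields.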
find by name - equal or not equal
db.students.find({name: "John"})
db.students.find({name: "Alice"})

db.students.find({name: {$ne: "John"}})
● $ne : not equal
find by name - ignorecase ($regex)
db.students.find({name: "john"})    => X (no match)
db.students.find({name: /john/i})   => O (matches)

db.students.find({
     name: {
       $regex: "^b",
       $options: "i"
     }
  })
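The patterns above behave like ordinary JavaScript regular expressions. A minimal sketch in plain JavaScript, using the three student names from the earlier slides:

```javascript
const names = ["John", "Alice", "Bob"];

// Exact string comparison is case-sensitive, so "john" matches nothing.
const exact = names.filter(function (n) { return n === "john"; });

// /john/i ignores case and matches "John".
const ignoreCase = names.filter(function (n) { return /john/i.test(n); });

// {$regex: "^b", $options: "i"} corresponds to new RegExp("^b", "i").
const startsWithB = names.filter(function (n) {
  return new RegExp("^b", "i").test(n);
});

console.log(exact, ignoreCase, startsWithB);
```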
find by range of names - $in, $nin
db.students.find({name: {$in: ["John", "Bob"]}})
db.students.find({name: {$nin: ["John", "Bob"]}})


● $in : in range (array of items)
● $nin : not in range
find by age - $gt, $gte, $lt, $lte
db.students.find({age:   {$gt: 12}})
db.students.find({age:   {$gte: 12}})
db.students.find({age:   {$lt: 12}})
db.students.find({age:   {$lte: 12}})

●   $gt    :   greater than
●   $gte   :   greater than or equal
●   $lt    :   less than
●   $lte   :   less than or equal
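To make the operator semantics concrete, here is a small in-memory matcher in plain JavaScript. It is only a sketch: matches and find are hypothetical helpers, and MongoDB's real matching also deals with types, missing fields and arrays, which this ignores.

```javascript
// One function per operator; arg is the value written in the find spec.
const comparators = {
  $gt:  function (value, arg) { return value > arg; },
  $gte: function (value, arg) { return value >= arg; },
  $lt:  function (value, arg) { return value < arg; },
  $lte: function (value, arg) { return value <= arg; },
  $ne:  function (value, arg) { return value !== arg; },
  $in:  function (value, arg) { return arg.indexOf(value) !== -1; },
  $nin: function (value, arg) { return arg.indexOf(value) === -1; }
};

// A document matches when every field condition in the spec holds.
function matches(doc, spec) {
  return Object.keys(spec).every(function (field) {
    const cond = spec[field];
    if (cond === null || typeof cond !== "object") {
      return doc[field] === cond; // plain equality, e.g. {name: "John"}
    }
    return Object.keys(cond).every(function (op) {
      return comparators[op](doc[field], cond[op]);
    });
  });
}

const students = [
  {name: "John", age: 12},
  {name: "Alice", age: 14},
  {name: "Bob", age: 11}
];

// Returns the names of the matching students.
function find(spec) {
  return students
    .filter(function (s) { return matches(s, spec); })
    .map(function (s) { return s.name; });
}

console.log(find({age: {$gt: 12}}));
console.log(find({age: {$lt: 12, $gte: 11}}));
console.log(find({name: {$nin: ["John", "Bob"]}}));
```

Several operators on the same field are combined with AND, which is why {$lt: 12, $gte: 11} selects a range.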
find by field existence - $exists
db.students.find({registered: {$exists: true}})
db.students.find({registered: {$exists: false}})
find by field type - $type
db.students.find({_id: {$type: 7}})
db.students.find({_id: {$type: 1}})
  Number  Type            Number  Type
  1       Double          11      Regular expression
  2       String          13      JavaScript code
  3       Object          14      Symbol
  4       Array           15      JavaScript code with scope
  5       Binary Data     16      32 bit integer
  7       Object id       17      Timestamp
  8       Boolean         18      64 bit integer
  9       Date            255     Min key
  10      Null            127     Max key
find in multi-level fields
db.students.find({"grades.math": {$gt: 2.0}})
db.students.find({"grades.math": {$gte: 2.0}})
find by remainder - $mod
db.students.find({age: {$mod: [10, 2]}})
db.students.find({age: {$mod: [10, 3]}})
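The $mod argument is [divisor, remainder]. In plain JavaScript the same test is a single modulo comparison; modMatch is a hypothetical helper, and the ages are those of John, Alice and Bob from the earlier slides.

```javascript
// {age: {$mod: [divisor, remainder]}} keeps documents where
// age % divisor === remainder.
function modMatch(value, divisor, remainder) {
  return value % divisor === remainder;
}

const ages = [12, 14, 11];
const remainderTwo = ages.filter(function (a) { return modMatch(a, 10, 2); });
const remainderThree = ages.filter(function (a) { return modMatch(a, 10, 3); });
console.log(remainderTwo, remainderThree);
```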
find in array - $size
db.students.find(
  {'favorite subjects': {$size: 2}}
)
db.students.find(
  {'favorite subjects': {$size: 3}}
)
find in array - $all
db.students.find({'favorite subjects': {
      $all: ["music", "math", "art"]
  }})
db.students.find({'favorite subjects': {
      $all: ["english", "math"]
  }})
find in array - find value in array
db.students.find(
  {"favorite subjects": "art"}
)

db.students.find(
  {"favorite subjects": "math"}
)
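The three array queries ($size, $all, and a plain value) can be pictured with the following plain JavaScript sketch. The helper names are hypothetical and the two documents reuse the subjects saved for John and Bob earlier.

```javascript
const john = {name: "John", "favorite subjects": ["math", "english"]};
const bob = {name: "Bob", "favorite subjects": ["music", "math", "art"]};

// $size: the array has exactly n elements.
function sizeIs(arr, n) {
  return Array.isArray(arr) && arr.length === n;
}

// $all: every wanted value is present, in any order.
function hasAll(arr, wanted) {
  return wanted.every(function (w) { return arr.indexOf(w) !== -1; });
}

// Plain value: the array contains that value somewhere.
function contains(arr, value) {
  return arr.indexOf(value) !== -1;
}

console.log(sizeIs(john["favorite subjects"], 2));
console.log(hasAll(bob["favorite subjects"], ["art", "math"]));
console.log(contains(john["favorite subjects"], "art"));
```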
find with bool operators - $and, $or
db.students.find({$or: [
    {age: {$lt: 12}},
    {age: {$gt: 12}}
]})

db.students.find({$and: [
    {age: {$lt: 12}},
    {age: {$gte: 11}}
]})
find with bool operators - $and, $or
db.students.find({$and: [
    {age: {$lt: 12}},
    {age: {$gte: 11}}
]})

is equivalent to

db.students.find({age: {$lt: 12, $gte: 11}})
find with bool operators - $not
$not can only be used together with another find operator

X db.students.find({registered: {$not: false}})
O db.students.find({registered: {$ne: false}})

O db.students.find({age: {$not: {$gte: 12}}})
find with JavaScript- $where
db.students.find({$where: "this.age > 12"})

db.students.find({$where:
   "this.grades.chinese"
})
find cursor functions
● count
  db.students.find().count()
● limit
  db.students.find().limit(1)
● skip
  db.students.find().skip(1)
● sort
  db.students.find().sort({age: -1})
  db.students.find().sort({age: 1})
combine find cursor functions
db.students.find().skip(1).limit(1)
db.students.find().skip(1).sort({age: -1})
db.students.find().skip(1).limit(1).sort({age: -1})
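One detail worth knowing when chaining: the order in which sort, skip and limit are called does not change the result, because the sort is applied first, then the skip, then the limit. The plain JavaScript sketch below imitates that order on an array; runCursor is a hypothetical helper and supports a single numeric sort field only.

```javascript
// Sketch: apply sort, then skip, then limit, whatever order they were given in.
function runCursor(docs, options) {
  let result = docs.slice(); // copy, so the input array is not reordered
  if (options.sort) {
    const field = Object.keys(options.sort)[0];
    const direction = options.sort[field]; // 1 ascending, -1 descending
    result.sort(function (a, b) { return (a[field] - b[field]) * direction; });
  }
  result = result.slice(options.skip || 0);
  if (options.limit) {
    result = result.slice(0, options.limit);
  }
  return result;
}

const students = [
  {name: "John", age: 12},
  {name: "Alice", age: 14},
  {name: "Bob", age: 11}
];

// Same options as find().skip(1).limit(1).sort({age: -1})
const page = runCursor(students, {skip: 1, limit: 1, sort: {age: -1}});
console.log(page);
```

Here the oldest student is skipped and the second oldest is returned, even though sort was written last.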
more cursor functions
● snapshot
  ensure cursor returns
  ○ no duplicates
  ○ misses no object
  ○ returns all matching objects that were present at
    the beginning and the end of the query.
  ○ usually for export/dump usage
more cursor functions
● batchSize
  tell MongoDB how many documents should
  be sent to client at once

● explain
  for performance profiling

● hint
  tell MongoDB which index should be used
  for querying/sorting
list current running operations
● list operations
  db.currentOp()

● cancel operations
  db.killOp(opid)   // opid as listed by db.currentOp()
MongoDB index - when to use index?
● when running complicated queries
● when sorting lots of data
MongoDB index - sort() example
for (i=0; i<1000000; i++){
    db.many.save({value: i});
}

db.many.find().sort({value: -1})

error: {
    "$err" : "too much data for sort() with no index. add an index or specify
a smaller limit",
    "code" : 10128
}
MongoDB index - how to build index
db.many.ensureIndex({value: 1})

● Index options
  ○   background
  ○   unique
  ○   dropDups
  ○   sparse
MongoDB index - index commands
● list index
  db.many.getIndexes()

● drop index
  db.many.dropIndex({value: 1})
  db.many.dropIndexes() <-- DANGER!
MongoDB Index - find() example
db.many.dropIndex({value: 1})
db.many.find({value: 5555}).explain()

db.many.ensureIndex({value: 1})
db.many.find({value: 5555}).explain()
MongoDB Index - Compound Index
db.xxx.ensureIndex({a:1, b:-1, c:1})

query/sort with fields
   ● a
   ● a, b
   ● a, b, c
will be accelerated by this index
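The rule on this slide is a prefix rule: the queried fields have to cover the leading keys of the index. A simplified plain JavaScript check is sketched below; usesIndexPrefix is a hypothetical helper that captures only the rule as stated here, not the real query planner.

```javascript
// True when queryFields are exactly the first N keys of the index, in any order.
function usesIndexPrefix(indexKeys, queryFields) {
  if (queryFields.length === 0 || queryFields.length > indexKeys.length) {
    return false;
  }
  const prefix = indexKeys.slice(0, queryFields.length);
  return queryFields.every(function (f) { return prefix.indexOf(f) !== -1; });
}

const index = ["a", "b", "c"]; // from ensureIndex({a:1, b:-1, c:1})
console.log(usesIndexPrefix(index, ["a"]));
console.log(usesIndexPrefix(index, ["a", "b"]));
console.log(usesIndexPrefix(index, ["b", "c"]));
```

A query on b alone, or on b and c, does not start at the first key, so this index does not help it in the way described above.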
Remove/Drop data from MongoDB
● Remove
  db.many.remove({value: 5555})
  db.many.find({value: 5555})
  db.many.remove()
● Drop
  db.many.drop()
● Drop database
  db.dropDatabase() EXTREMELY DANGEROUS!!!
How to update data in MongoDB
Easiest way:

s = db.students.findOne({_id: 1})
s.registered = true
db.students.save(s)
In place update - update()
update( {find spec},
        {update spec},
        upsert=false)

db.students.update(
  {_id: 1},
  {$set: {registered: false}}
)
Update a non-existent document (upsert)
db.students.update(
  {_id: 2},
  {name: 'Mary', age: 9},
  true
)
db.students.update(
  {_id: 2},
  {$set: {name: 'Mary', age: 9}},
  true
)
set / unset field value
db.students.update({_id: 1},
  {$set: {"age": 15}})

db.students.update({_id: 1},
  {$set: {registered:
      {2012: false, 2011:true}
  }})
db.students.update({_id: 1},
  {$unset: {registered: 1}})
increase/decrease value
db.students.update({_id: 1}, {
   $inc: {
      "grades.math": 1.1,
      "grades.english": -1.5,
      "grades.history": 3.0
   }
})
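What $set, $unset and $inc do to a document, including dotted paths such as "grades.math", can be sketched in plain JavaScript as below. resolve and applyUpdate are hypothetical helpers that cover only these three operators, and the starting document is made up for the example.

```javascript
// Walk a dotted path and return the parent object plus the last key.
function resolve(doc, path) {
  const keys = path.split(".");
  let parent = doc;
  for (let i = 0; i < keys.length - 1; i++) {
    if (typeof parent[keys[i]] !== "object" || parent[keys[i]] === null) {
      parent[keys[i]] = {}; // create missing intermediate objects
    }
    parent = parent[keys[i]];
  }
  return {parent: parent, key: keys[keys.length - 1]};
}

function applyUpdate(doc, update) {
  Object.keys(update.$set || {}).forEach(function (path) {
    const t = resolve(doc, path);
    t.parent[t.key] = update.$set[path];
  });
  Object.keys(update.$unset || {}).forEach(function (path) {
    const t = resolve(doc, path);
    delete t.parent[t.key];
  });
  Object.keys(update.$inc || {}).forEach(function (path) {
    const t = resolve(doc, path);
    // a missing field is treated as 0, so $inc creates it
    t.parent[t.key] = (t.parent[t.key] || 0) + update.$inc[path];
  });
  return doc;
}

const student = {_id: 1, age: 11, registered: true,
                 grades: {math: 2.0, english: 5.0}};
applyUpdate(student, {$set: {age: 15}, $unset: {registered: 1}});
applyUpdate(student, {$inc: {
  "grades.math": 1.5, "grades.english": -1.5, "grades.history": 3.0
}});
console.log(student);
```

Note that $inc with a negative number decreases the value, and a field that does not exist yet (grades.history) is created with the given amount.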
push value(s) into array
db.students.update({_id: 1},{
   $push: {tags: "lazy"}
})

db.students.update({_id: 1},{
   $pushAll: {tags: ["smart", "cute"]}
})
add a value to an array only if it does not exist yet
db.students.update({_id: 1},{
   $push: {tags: "lazy"}
})
db.students.update({_id: 1},{
   $addToSet:{tags: "lazy"}
})
db.students.update({_id: 1},{
   $addToSet:{tags: {$each: ["tall", "thin"]}}
})
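The difference between $push and $addToSet, sketched in plain JavaScript with hypothetical helpers push and addToSet:

```javascript
// $push always appends, even if the value is already there.
function push(arr, value) {
  arr.push(value);
  return arr;
}

// $addToSet (with $each) appends only the values that are missing.
function addToSet(arr, values) {
  values.forEach(function (v) {
    if (arr.indexOf(v) === -1) {
      arr.push(v);
    }
  });
  return arr;
}

const tags = ["lazy"];
push(tags, "lazy");                // duplicate is stored
addToSet(tags, ["lazy"]);          // nothing added
addToSet(tags, ["tall", "thin"]);  // both added
console.log(tags);
```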
remove value from array
db.students.update({_id: 1},{
   $pull: {tags: "lazy"}
})
db.students.update({_id: 1},{
   $pull: {tags: {$ne: "smart"}}
})
db.students.update({_id: 1},{
   $pullAll: {tags: ["lazy", "smart"]}
})
pop value from array
a = []; for(i=0;i<20;i++){a.push(i);}
db.test.save({_id:1, value: a})

db.test.update({_id: 1}, {
   $pop: {value: 1}
})
db.test.update({_id: 1}, {
   $pop: {value: -1}
})
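The sign passed to $pop decides which end of the array loses an element: 1 removes the last element, -1 removes the first. A plain JavaScript sketch with a hypothetical pop helper:

```javascript
// direction 1 removes the last element, -1 removes the first.
function pop(arr, direction) {
  if (direction === 1) {
    arr.pop();
  } else if (direction === -1) {
    arr.shift();
  }
  return arr;
}

const value = [];
for (let i = 0; i < 20; i++) { value.push(i); }

pop(value, 1);   // drops 19
pop(value, -1);  // drops 0
console.log(value[0], value[value.length - 1], value.length);
```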
rename field
db.test.update({_id: 1}, {
   $rename: {value: "values"}
})
Practice: add comments to student
Add a field into students ({_id: 1}):
● field name: comments
● field type: array of dictionaries (objects)
● field content:
   ○ {
         by: author name, string
         text: content of comment, string
    }
● add at least 3 comments to this field
Example answer to practice
db.students.update({_id: 1}, {
$addToSet: { comments: {$each: [
    {by: "teacher01", text: "text 01"},
    {by: "teacher02", text: "text 02"},
    {by: "teacher03", text: "text 03"},
]}}
})
The $ position operator (for array)
db.students.update({
      _id: 1,
      "comments.by": "teacher02"
   }, {
      $inc: {"comments.$.vote": 1}
})
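In the update above, $ stands for the index of the first array element that matched "comments.by". The plain JavaScript sketch below shows the same idea; incPositional is a hypothetical helper and the comments are those from the practice answer.

```javascript
// Find the first array element whose matchKey equals matchValue,
// then increment targetKey on that element only.
function incPositional(doc, arrayField, matchKey, matchValue, targetKey, amount) {
  const arr = doc[arrayField];
  for (let i = 0; i < arr.length; i++) {
    if (arr[i][matchKey] === matchValue) {
      arr[i][targetKey] = (arr[i][targetKey] || 0) + amount;
      return i; // this is the position that $ refers to
    }
  }
  return -1; // nothing matched, nothing updated
}

const student = {_id: 1, comments: [
  {by: "teacher01", text: "text 01"},
  {by: "teacher02", text: "text 02"},
  {by: "teacher03", text: "text 03"}
]};

const position = incPositional(student, "comments", "by", "teacher02", "vote", 1);
console.log(position, student.comments[position]);
```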
Atomically update - findAndModify
● Atomically update SINGLE DOCUMENT and
  return it
● By default, returned document won't
  contain the modification made in
  findAndModify command.
findAndModify parameters
db.xxx.findAndModify({
query: filter to query
sort: how to sort and select 1st document in query results
remove: set true if you want to remove it
update: update content
new: set true if you want to get the modified object
fields: which fields to fetch
upsert: create object if not exists
})
GridFS
●   MongoDB has a 16MB document size limit
●   GridFS is for storing large binary objects in MongoDB
●   GridFS is a specification, not an implementation
●   The implementation is provided by MongoDB drivers
●   Currently supported drivers:
    ○   PHP
    ○   Java
    ○   Python
    ○   Ruby
    ○   Perl
GridFS - command line tools
● List
  mongofiles list
● Put
  mongofiles put xxx.txt
● Get
  mongofiles get xxx.txt
MongoDB config - basic
● dbpath
  ○ Which folder to put MongoDB database files
  ○ MongoDB must have write permission to this folder
● logpath, logappend
  ○ logpath = log filename
  ○ MongoDB must have write permission to log file
● bind_ip
  ○ IP(s) MongoDB will bind to; by default, all
  ○ Use a comma to separate more than one IP
● port
  ○ Port number MongoDB will use
  ○ Default port = 27017
Small tip - rotate MongoDB log
db.getMongo().getDB("admin").runCommand("logRotate")
MongoDB config - journal
● journal
  ○ Set journal on/off
  ○ Usually you should keep this on
MongoDB config - http interface
● nohttpinterface
  ○ Disables the HTTP interface, which by default listens
    on http://localhost:28017
  ○ The HTTP interface shows statistics
● rest
  ○ Only takes effect when the HTTP interface is enabled
  ○ Example:
    http://localhost:28017/test/students/
    http://localhost:28017/test/students/?filter_name=John
MongoDB config - authentication
● auth
  ○ By default, MongoDB runs with no authentication
  ○ If no admin account has been created, you can log in
    without authentication through the local mongo shell
    and start managing user accounts.
MongoDB account management
● Add admin user
  > mongo localhost/admin
  db.addUser("testadmin", "1234")
● Authenticate as admin user
  use admin
  db.auth("testadmin", "1234")
MongoDB account management
● Add user to test database
  use test
  db.addUser("testrw", "1234")
● Add read only user to test database
  db.addUser("testro", "1234", true)
● List users
  db.system.users.find()
● Remove user
  db.removeUser("testro")
MongoDB config - authentication
● keyFile
  ○ At least 6 characters and size smaller than 1KB
  ○ Used only for replica/sharding servers
  ○ Every replica/sharding server should use the same
    key file for communication
  ○ On Unix-like systems, the key file must grant no
    permissions to group/everyone, or MongoDB will
    refuse to start
MongoDB configuration - Replica Set
● replSet
  ○ Indicates the replica set name
  ○ All MongoDB instances in the same replica set
    should use the same name
  ○ Limitation
     ■ Maximum 12 nodes in a single replica set
     ■ Maximum 7 nodes can vote
  ○ A MongoDB replica set is eventually consistent
How does a MongoDB replica set work?
● Each replica set has a single primary
  (master) node and multiple slave nodes
● Data is written only to the primary node
  and then synced to the slave nodes.
● Use getLastError() to confirm that the previous
  write operation was committed to the whole
  replica set; otherwise the write operation
  may be rolled back if the primary node goes down
  before the sync.
How does a MongoDB replica set work?
● Once the primary node is down, the whole
  replica set is marked as failed and no
  operation can be done on it until the other nodes
  vote and elect a new primary node.
● During failover, any write operation not
  committed to the whole replica set will be
  rolled back
Simple replica set configuration
mkdir -p /tmp/db01
mkdir -p /tmp/db02
mkdir -p /tmp/db03

mongod --replSet test --port 29001 --dbpath /tmp/db01
mongod --replSet test --port 29002 --dbpath /tmp/db02
mongod --replSet test --port 29003 --dbpath /tmp/db03
Simple replica set configuration
mongo localhost:29001
Another way to config replica set
rs.initiate()
rs.add("localhost:29001")
rs.add("localhost:29002")
rs.add("localhost:29003")
Extra options for setting replica set
● arbiterOnly
  ○ Arbiter nodes don't receive data, can't become
    primary node but can vote.
● priority
  ○ Node with priority 0 will never be elected as
    primary node.
  ○ Higher priority nodes will be preferred as primary
  ○ If you want to force a node to become the primary
    node, do not change the node's votes; update the
    node's priority value and reconfig the replica set.
● buildIndexes
  ○ Can only be set to false on nodes with priority 0
  ○ Use false for backup only nodes
Extra options for setting replica set
● hidden
  ○ Nodes marked with hidden option will not be
    exposed to MongoDB clients.
  ○ Nodes marked with hidden option will not receive
    queries.
  ○ Only use this option for nodes with usage like
    reporting, integration, backup, etc.
● slaveDelay
  ○ How many seconds this slave node should stay behind
    the primary node
  ○ Can only be set on nodes with priority 0
  ○ Used for preventing some human errors
Extra options for setting replica set
● votes
  If set to 1, this node can vote; if set to 0, it cannot.
Change primary node at runtime
config = rs.conf()
config.members[1].priority = 2
rs.reconfig(config)
What is sharding?

[Diagram: a table of Name / Value rows sorted by name (Alice, Amy, Bob, ... Yoko, Zeus) is split by name range into three shards: A to F, G to N, and O to Z.]
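The idea in the diagram, routing each record to a shard by the range its key falls into, can be sketched in plain JavaScript. The shard names and the shardFor helper are hypothetical; in a real cluster this routing is done by mongos using chunk metadata.

```javascript
// Three ranges, as in the diagram: A to F, G to N, O to Z.
const shards = [
  {name: "shard A-F", from: "A", to: "F"},
  {name: "shard G-N", from: "G", to: "N"},
  {name: "shard O-Z", from: "O", to: "Z"}
];

// Pick the shard whose range contains the first letter of the name.
function shardFor(name) {
  const first = name.charAt(0).toUpperCase();
  for (let i = 0; i < shards.length; i++) {
    if (first >= shards[i].from && first <= shards[i].to) {
      return shards[i].name;
    }
  }
  return null; // key outside every range
}

console.log(shardFor("Alice"), shardFor("Bob"), shardFor("Yoko"), shardFor("Zeus"));
```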
MongoDB sharding architecture
Elements of MongoDB sharding cluster
● Config Server
  Stores the sharding cluster metadata
● mongos Router
  Routes database operations to the correct
  shard server
● Shard Server
  Holds the real user data
Sharding config - config server
● A config server is a MongoDB instance that runs
  with the --configsvr option
● Config servers are automatically kept in sync by
  the mongos process, so DO NOT run them with the
  --replSet option
● The synchronous replication protocol is
  optimized for three machines.
Sharding config - mongos Router
● Use mongos (not mongod) to start a
  mongos router
● mongos routes database operations to the
  correct shard servers
● Example command for starting mongos
  mongos --configdb db01,db02,db03
● With the --chunkSize option, you can specify
  a smaller sharding chunk if you're just
  testing.
Sharding config - shard server
● A shard server is a MongoDB instance that runs
  with the --shardsvr option
● Shard servers don't need to know where the
  config server / mongos router is
Example script for building MongoDB shard cluster
mkdir -p /tmp/s00
mkdir -p /tmp/s01
mkdir -p /tmp/s02
mkdir -p /tmp/s03

mongod --configsvr --port 29000 --dbpath /tmp/s00
mongos --configdb localhost:29000 --chunkSize 1 --port 28000
mongod --shardsvr --port 29001 --dbpath /tmp/s01
mongod --shardsvr --port 29002 --dbpath /tmp/s02
mongod --shardsvr --port 29003 --dbpath /tmp/s03
Sharding config - add shard server
mongo localhost:28000/admin

db.runCommand({addshard: "localhost:29001"})
db.runCommand({addshard: "localhost:29002"})
db.runCommand({addshard: "localhost:29003"})


db.printShardingStatus()
db.runCommand( { enablesharding : "test" } )
db.runCommand( {shardcollection: "test.shardtest",
key: {_id: 1}, unique: true})
Let us insert some documents
use test

for (i=0; i<1000000; i++) {
   db.shardtest.insert({value: i});
}
Remove 1 shard & see what happens
use admin
db.runCommand({removeshard: "shard0002"})

Let's add it back
db.runCommand({addshard: "localhost:29003"})
Pick your sharding key wisely
● The sharding key cannot be changed after
  sharding is enabled
● To update any document in a sharded
  cluster, the sharding key MUST BE INCLUDED in
  the find spec
EX:
  sharding key = {name: 1, class: 1}
  db.xxx.update({name: "xxxx", class: "ooo"},{
  ..... update spec
  })
Pick your sharding key wisely
● Sharding key will strongly affect your data
  distribution model
EX:
  sharding by ObjectId
  shard001 => data saved 2 months ago
  shard002 => data saved 1 month ago
  shard003 => data saved recently
Other sharding key examples
EX:
  sharding by Username
  shard001 => Username starts with a to k
  shard002 => Username starts with l to r
  shard003 => Username starts with s to z
EX:
  sharding by md5
  completely random distribution
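Why a hashed key spreads data evenly can be seen with a small plain JavaScript sketch. Note the swap: to keep the example free of dependencies it uses a 32-bit FNV-1a hash as a stand-in for md5, and the helper names and usernames are hypothetical.

```javascript
// 32-bit FNV-1a string hash, used here only as a stand-in for md5.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// The shard is chosen from the hash, not from the key's alphabetical position.
function shardIndex(key, shardCount) {
  return fnv1a(key) % shardCount;
}

const usernames = ["alice", "amy", "bob", "yoko", "zeus"];
const placement = usernames.map(function (u) {
  return u + " -> shard00" + (shardIndex(u, 3) + 1);
});
console.log(placement.join("\n"));
```

The trade-off is that neighbouring keys no longer live together, so range queries on the sharding key have to ask every shard.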
What is Mapreduce?
● Map then Reduce
● Map is the step that calls a function on each
  document to emit keys & values, which are sent
  to the reduce function
● Reduce is the step that calls a function to
  combine the keys & values emitted by the map
  function into a single reduced result.
● Example: map students' grades and reduce
  them into the students' total grades.
How to call mapreduce in MongoDB
db.xxx.mapReduce(
   map function,
   reduce function,{
   out: output option,
   query: query filter, optional,
   sort: sort filter, optional,
   finalize: finalize function,
   .... etc
})
Let's generate some data
for (i=0; i<10000; i++){
   db.grades.insert({
       grades: {
          math: Math.random() * 100 % 100,
          art: Math.random() * 100 % 100,
          music: Math.random() * 100 % 100
       }
   });
}
Prepare Map function
function map(){
   for (var k in this.grades){
       // emit one record per subject; pass/fail are 1 or 0 so they can be summed
       emit(k, {total: 1,
       pass: this.grades[k] >= 60.0 ? 1 : 0,
       fail: this.grades[k] < 60.0 ? 1 : 0,
       sum: this.grades[k],
       avg: 0
       });
   }
}
Prepare reduce function
function reduce(key, values){
   var result = {total: 0, pass: 0, fail: 0, sum: 0, avg: 0};
   values.forEach(function(value){
       result.total += value.total;
       result.pass += value.pass;
       result.fail += value.fail;
       result.sum += value.sum;
   });
   return result;
}
Execute your 1st mapreduce call
 db.grades.mapReduce(
   map,
   reduce,
   {out:{inline: 1}}
)
Add finalize function
function finalize(key, value){
   value.avg = value.sum / value.total;
   return value;
}
Run mapreduce again with finalize
 db.grades.mapReduce(
   map,
   reduce,
   {out:{inline: 1}, finalize: finalize}
)
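To see how map, reduce and finalize fit together without a server, the flow can be imitated in plain JavaScript. This is only a sketch: runMapReduce is a hypothetical in-memory stand-in, emit is passed to map as a parameter here (in MongoDB it is provided for you), and the three sample documents are made up.

```javascript
// In-memory stand-in for the mapReduce flow; not MongoDB's implementation.
function runMapReduce(docs, map, reduce, finalize) {
  const emitted = {}; // key -> list of emitted values
  function emit(key, value) {
    (emitted[key] = emitted[key] || []).push(value);
  }
  docs.forEach(function (doc) { map.call(doc, emit); });

  const results = {};
  Object.keys(emitted).forEach(function (key) {
    const values = emitted[key];
    // reduce is skipped when a key has only a single value
    const reduced = values.length === 1 ? values[0] : reduce(key, values);
    results[key] = finalize ? finalize(key, reduced) : reduced;
  });
  return results;
}

function map(emit) {
  for (const k in this.grades) {
    const passed = this.grades[k] >= 60.0;
    emit(k, {total: 1, pass: passed ? 1 : 0, fail: passed ? 0 : 1,
             sum: this.grades[k], avg: 0});
  }
}

function reduce(key, values) {
  const result = {total: 0, pass: 0, fail: 0, sum: 0, avg: 0};
  values.forEach(function (value) {
    result.total += value.total;
    result.pass += value.pass;
    result.fail += value.fail;
    result.sum += value.sum;
  });
  return result;
}

function finalize(key, value) {
  value.avg = value.sum / value.total;
  return value;
}

const docs = [
  {grades: {math: 80, art: 50}},
  {grades: {math: 40, art: 70}},
  {grades: {math: 90, art: 30}}
];
const out = runMapReduce(docs, map, reduce, finalize);
console.log(out);
```

The emitted value and the reduced value have the same shape, which is what allows reduce to be called again on its own output; the average is left to finalize because it cannot be combined that way.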
Mapreduce output options
● {replace: <result collection name>}
  Replace the result collection if it already exists.
● {merge: <result collection name>}
  Merge into the existing result collection; existing
  keys are always overwritten with new results.
● {reduce: <result collection name>}
  Run reduce if the same key exists in both the
  old and current result collections. Will run the
  finalize function, if any.
● {inline: 1}
  Return the result in memory
Other mapreduce output options
● db - put the result collection in a different
  database
● sharded - the output collection will be sharded
  using key = _id
● nonAtomic - partial reduce results will be
  visible while processing.
MongoDB backup & restore
● mongodump
  mongodump -h localhost:27017
● mongorestore
  mongorestore -h localhost:27017 --drop
● mongoexport
  mongoexport -d test -c students -h
  localhost:27017 > students.json
● mongoimport
  mongoimport -d test -c students -h
  localhost:27017 < students.json
Conclusion - Pros of MongoDB
●   Agile (Schemaless)
●   Easy to use
●   Built in replica & sharding
●   Mapreduce with sharding
Conclusion - Cons of MongoDB
● Schemaless = everyone needs to know what
  the data looks like
● Wastes space on keys
● Eats lots of memory
● Mapreduce is hard to handle
Cautions of MongoDB
● Global write lock
  ○ Add more RAM
  ○ Use a newer version (MongoDB 2.2 now has a
    database-level write lock)
  ○ Split your database properly
● Removing documents won't free disk space
  ○ You need to run the compact command periodically
● Don't let your MongoDB data disk fill up
  ○ Once the disk used by MongoDB is full, you
    won't be able to move/delete documents on it.

MongoDB - Features and Operations
 
MongoDB
MongoDB MongoDB
MongoDB
 
Introduction to MongoDB and Workshop
Introduction to MongoDB and WorkshopIntroduction to MongoDB and Workshop
Introduction to MongoDB and Workshop
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
 
Querying mongo db
Querying mongo dbQuerying mongo db
Querying mongo db
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
 
Schema Design
Schema DesignSchema Design
Schema Design
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)
 
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesWebscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
 
Schema design mongo_boston
Schema design mongo_bostonSchema design mongo_boston
Schema design mongo_boston
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
 
PHP Machinist Presentation
PHP Machinist PresentationPHP Machinist Presentation
PHP Machinist Presentation
 

Último

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Último (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Mongo db

  • 1. MongoDB http://tinyurl.com/97o49y3 by toki
  • 2. About me ● Delta Electronic CTBD Senior Engineer ● Main developer of http://loltw.net ○ Website built on MongoDB, with 600k daily page views ○ Data grows every day via automated crawler bots
  • 3. MongoDB - Simple Introduction ● Document-based NOSQL (Not Only SQL) database ● Started in 2007 by 10gen ● Written in C++ ● Fast (but uses a lot of memory) ● Stores JSON documents in BSON format ● Full index on any document attribute ● Horizontal scalability with auto sharding ● High availability & replica ready
  • 4. What is a database? ● Raw data ○ John is a student, he's 12 years old. ● Data ○ Student ■ name = "John" ■ age = 12 ● Records ○ Student(name="John", age=12) ○ Student(name="Alice", age=11) ● Database ○ Student Table ○ Grades Table
  • 5. Example of (relational) database [Diagram: four related tables] ● Student: Student ID, Name, Age, Class ID ● Student Grade: Grade ID, StudentID, Grade ● Grade: Grade ID, Name ● Class: Class ID, Name
  • 6. SQL Language - How to find data? ● Find the student named John ○ select * from student where name="John" ● Find the class name of John ○ select s.name, c.name as class_name from student s, class c where s.name="John" and s.class_id=c.class_id
  • 7. Why NOSQL? ● Big data ○ Modern data sets are too big for a single DB server ○ Google search engine ● Connectivity ○ Facebook like button ● Semi-structured data ○ Car equipment database ● High availability ○ The basis of cloud services
  • 8. Common NOSQL DB characteristics ● Schemaless ● No join, stores pre-joined/embedded data ● Horizontal scalability ● Replica ready - High availability
  • 9. Common types of NOSQL DB ● Key-Value ○ Based on Amazon's Dynamo paper ○ Stores K-V pairs ○ Example: ■ Dynomite ■ Voldemort
  • 10. Common types of NOSQL DB ● Bigtable clones ○ Based on Google Bigtable paper ○ Column oriented, but handles semi-structured data ○ Data keyed by: row, column, time, index ○ Example: ■ Google Big Table ■ HBase ■ Cassandra(FB)
  • 11. Common types of NOSQL DB ● Document based ○ Stores multi-level K-V pairs ○ Usually uses JSON as the document format ○ Example: ■ MongoDB ■ CouchDB (Apache) ■ Redis (more commonly classed as a key-value store)
  • 12. Common types of NOSQL DB ● Graph ○ Focus on modeling the structure of data - interconnectivity ○ Example ■ Neo4j ■ AllegroGraph
  • 13. Start using MongoDB - Installation ● From apt-get (debian / ubuntu only) ○ sudo apt-get install mongodb ● Using the 10gen mongodb repository ○ http://docs.mongodb.org/manual/tutorial/install-mongodb-on-debian-or-ubuntu-linux/ ● From pre-built binary or source ○ http://www.mongodb.org/downloads ● Note: 32-bit builds are limited to around 2GB of data
  • 14. Manually start your MongoDB mkdir -p /tmp/mongo mongod --dbpath /tmp/mongo or mongod -f mongodb.conf
  • 15. Verify your MongoDB installation $ mongo MongoDB shell version: 2.2.0 connecting to: test >_ -------------------------------------------------------- mongo localhost/test2 mongo 127.0.0.1/test
  • 16. How many databases do you have? show dbs
  • 17. Elements of MongoDB ● Database ○ Collection ■ Document
  • 18. What is JSON ● JavaScript Object Notation ● Elements of JSON ○ Object: K/V pairs ○ Key: string ○ Value, could be ■ string ■ bool ■ number ■ array ■ object ■ null ● Example: { "key1": "value1", "key2": 2.0, "key3": [1, "str", 3.0], "key4": false, "key5": { "name": "another object" } }
  • 19. Another sample of JSON { "name": "John", "age": 12, "grades": { "math": 4.0, "english": 5.0 }, "registered": true, "favorite subjects": ["math", "english"] }
  • 20. Insert document into MongoDB s={ "name": "John", "age": 12, "grades": { "math": 4.0, "english": 5.0 }, "registered": true, "favorite subjects": ["math", "english"] } db.students.insert(s);
  • 21. Verify inserted document db.students.find() also try db.student.insert(s) show collections
  • 22. Save document into MongoDB s.name = "Alice" s.age = 14 s.grades.math = 2.0 db.students.save(s)
  • 23. What is _id / ObjectId ? ● _id is the default primary key for indexing documents; it can be almost any JSON-acceptable value. ● By default, MongoDB auto-generates an ObjectId as _id ● An ObjectId is a 12-byte value that serves as a unique document _id ● Use ObjectId().getTimestamp() to extract the timestamp from an ObjectId ● Byte layout: bytes 0-3 unix timestamp, bytes 4-6 machine, bytes 7-8 process id, bytes 9-11 increment
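The byte layout on this slide can be checked by hand. The following is a minimal sketch in plain JavaScript (no MongoDB needed) that splits a 24-character ObjectId hex string into the four parts. The function name parseObjectId is hypothetical, and the field widths (4, 3, 2, and 3 bytes) are assumed from the classic ObjectId layout of that era; in the mongo shell itself, ObjectId().getTimestamp() does the timestamp part for you.

```javascript
// Sketch: split an ObjectId hex string into timestamp / machine /
// process id / increment. parseObjectId is a hypothetical helper.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  // bytes 0-3: seconds since the Unix epoch, big-endian
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    seconds: seconds,
    timestamp: new Date(seconds * 1000),
    machine: hex.slice(8, 14),                  // bytes 4-6
    processId: hex.slice(14, 18),               // bytes 7-8
    increment: parseInt(hex.slice(18, 24), 16), // bytes 9-11
  };
}

const parts = parseObjectId("507f1f77bcf86cd799439011");
console.log(parts.timestamp.toISOString());
console.log(parts.machine, parts.processId, parts.increment);
```

Because the timestamp sits in the leading bytes, ObjectIds generated later sort after earlier ones, which matters when an ObjectId is used as a sharding key.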
  • 24. Save document with id into MongoDB s.name = "Bob" s.age = 11 s['favorite subjects'] = ["music", "math", "art"] s.grades.chinese = 3.0 s._id = 1 db.students.save(s)
  • 25. Save document with existing _id delete s.registered db.students.save(s)
  • 26. How to find documents? ● db.xxxx.find() ○ lists all documents in the collection ● db.xxxx.find( find spec, //what the document looks like find fields, //which parts I want to see ... ) ● db.xxxx.findOne() ○ returns only the first document that matches the find spec.
  • 27. find by id db.students.find({_id: 1}) db.students.find({_id: ObjectId('xxx....')})
  • 28. find and filter return fields db.students.find({_id: 1}, {_id: 1}) db.students.find({_id: 1}, {name: 1}) db.students.find({_id: 1}, {_id: 1, name: 1}) db.students.find({_id: 1}, {_id: 0, name: 1})
  • 29. find by name - equal or not equal db.students.find({name: "John"}) db.students.find({name: "Alice"}) db.students.find({name: {$ne: "John"}}) ● $ne : not equal
  • 30. find by name - ignorecase ($regex) db.students.find({name: "john"}) => X db.students.find({name: /john/i}) => O db.students.find({ name: { $regex: "^b", $options: "i" } })
  • 31. find by range of names - $in, $nin db.students.find({name: {$in: ["John", "Bob"]}}) db.students.find({name: {$nin: ["John", "Bob"]}}) ● $in : in range (array of items) ● $nin : not in range
  • 32. find by age - $gt, $gte, $lt, $lte db.students.find({age: {$gt: 12}}) db.students.find({age: {$gte: 12}}) db.students.find({age: {$lt: 12}}) db.students.find({age: {$lte: 12}}) ● $gt : greater than ● $gte : greater than or equal ● $lt : less than ● $lte : less than or equal
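To make the operator semantics concrete, here is a small in-memory sketch in plain JavaScript. It is not how MongoDB evaluates queries internally; matchesField, findByAge, and the sample data are hypothetical, and only the comparison operators from this slide (plus $ne) are covered.

```javascript
// Sketch: evaluate a comparison spec such as {$gt: 12} against a value.
const comparators = {
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x,
  $ne: (v, x) => v !== x,
};

function matchesField(value, spec) {
  // Every operator in the spec must hold (implicit AND).
  return Object.keys(spec).every((op) => {
    if (!(op in comparators)) {
      throw new Error("unsupported operator: " + op);
    }
    return comparators[op](value, spec[op]);
  });
}

// Illustrative data, loosely based on the students used in the slides.
const students = [
  { name: "John", age: 12 },
  { name: "Alice", age: 14 },
  { name: "Bob", age: 11 },
];

function findByAge(spec) {
  return students.filter((s) => matchesField(s.age, spec)).map((s) => s.name);
}

console.log(findByAge({ $gt: 12 }));
console.log(findByAge({ $gte: 12 }));
console.log(findByAge({ $lt: 12, $gte: 11 }));
```

Putting several operators in one spec, as in the last call, behaves like an AND over the same field.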
  • 33. find by field existence - $exists db.students.find({registered: {$exists: true}}) db.students.find({registered: {$exists: false}})
  • 34. find by field type - $type db.students.find({_id: {$type: 7}}) db.students.find({_id: {$type: 1}}) ● Type codes: 1 Double; 2 String; 3 Object; 4 Array; 5 Binary data; 7 Object id; 8 Boolean; 9 Date; 10 Null; 11 Regular expression; 13 JavaScript code; 14 Symbol; 15 JavaScript code with scope; 16 32-bit integer; 17 Timestamp; 18 64-bit integer; 255 Min key; 127 Max key
  • 35. find in multi-level fields db.students.find({"grades.math": {$gt: 2.0}}) db.students.find({"grades.math": {$gte: 2.0}})
  • 36. find by remainder - $mod db.students.find({age: {$mod: [10, 2]}}) db.students.find({age: {$mod: [10, 3]}})
  • 37. find in array - $size db.students.find( {'favorite subjects': {$size: 2}} ) db.students.find( {'favorite subjects': {$size: 3}} )
  • 38. find in array - $all db.students.find({'favorite subjects': { $all: ["music", "math", "art"] }}) db.students.find({'favorite subjects': { $all: ["english", "math"] }})
  • 39. find in array - find value in array db.students.find( {"favorite subjects": "art"} ) db.students.find( {"favorite subjects": "math"} )
  • 40. find with bool operators - $and, $or db.students.find({$or: [ {age: {$lt: 12}}, {age: {$gt: 12}} ]}) db.students.find({$and: [ {age: {$lt: 12}}, {age: {$gte: 11}} ]})
  • 41. find with bool operators - $and, $or db.students.find({$and: [ {age: {$lt: 12}}, {age: {$gte: 11}} ]}) is equivalent to db.students.find({age: {$lt: 12, $gte: 11}})
  • 42. find with bool operators - $not $not can only be used together with another operator expression X db.students.find({registered: {$not: false}}) O db.students.find({registered: {$ne: false}}) O db.students.find({age: {$not: {$gte: 12}}})
  • 43. find with JavaScript- $where db.students.find({$where: "this.age > 12"}) db.students.find({$where: "this.grades.chinese" })
  • 44. find cursor functions ● count db.students.find().count() ● limit db.students.find().limit(1) ● skip db.students.find().skip(1) ● sort db.students.find().sort({age: -1}) db.students.find().sort({age: 1})
  • 45. combine find cursor functions db.students.find().skip(1).limit(1) db.students.find().skip(1).sort({age: -1}) db.students.find().skip(1).limit(1).sort({age: -1})
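One detail worth knowing when chaining these: as far as I know, the server applies sort first, then skip, then limit, regardless of the order in which the methods are chained. The sketch below emulates that order on an in-memory array in plain JavaScript; runCursor and the sample data are hypothetical, and only single-field numeric sorts are handled.

```javascript
// Sketch: emulate sort -> skip -> limit on an array of documents.
function runCursor(docs, options) {
  const skip = options.skip || 0;
  const limit = options.limit || 0; // 0 means "no limit"
  let out = docs.slice();           // never mutate the input
  if (options.sort) {
    const [field, dir] = Object.entries(options.sort)[0];
    out.sort((a, b) => (a[field] - b[field]) * dir);
  }
  out = out.slice(skip);
  if (limit > 0) {
    out = out.slice(0, limit);
  }
  return out;
}

const students = [
  { name: "John", age: 12 },
  { name: "Alice", age: 14 },
  { name: "Bob", age: 11 },
];

// Same result as .skip(1).limit(1).sort({age: -1}) in the slide:
// oldest first, drop one, keep one.
const page = runCursor(students, { sort: { age: -1 }, skip: 1, limit: 1 });
console.log(page.map((s) => s.name));
```

This is why skip(1).limit(1).sort({age: -1}) returns the second-oldest student rather than sorting a one-element page.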
  • 46. more cursor functions ● snapshot ensure cursor returns ○ no duplicates ○ misses no object ○ returns all matching objects that were present at the beginning and the end of the query. ○ usually for export/dump usage
  • 47. more cursor functions ● batchSize tell MongoDB how many documents should be sent to client at once ● explain for performance profiling ● hint tell MongoDB which index should be used for querying/sorting
  • 48. list current running operations ● list operations db.currentOp() ● cancel an operation db.killOp(opid)
  • 49. MongoDB index - when to use index? ● when running complicated finds ● when sorting lots of data
  • 50. MongoDB index - sort() example for (i=0; i<1000000; i++){ db.many.save({value: i}); } db.many.find().sort({value: -1}) error: { "$err" : "too much data for sort() with no index. add an index or specify a smaller limit", "code" : 10128 }
  • 51. MongoDB index - how to build index db.many.ensureIndex({value: 1}) ● Index options ○ background ○ unique ○ dropDups ○ sparse
  • 52. MongoDB index - index commands ● list index db.many.getIndexes() ● drop index db.many.dropIndex({value: 1}) db.many.dropIndexes() <-- DANGER!
  • 53. MongoDB Index - find() example db.many.dropIndex({value: 1}) db.many.find({value: 5555}).explain() db.many.ensureIndex({value: 1}) db.many.find({value: 5555}).explain()
  • 54. MongoDB Index - Compound Index db.xxx.ensureIndex({a:1, b:-1, c:1}) query/sort with fields ● a ● a, b ● a, b, c will be accelerated by this index
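The prefix rule on this slide can be written down as a small check. This is a sketch of the slide's rule only, in plain JavaScript; usesIndexPrefix is a hypothetical helper, and the real query planner is more flexible (for example, it may still use the index partially for a query on a and c).

```javascript
// Sketch: does the set of queried fields form a leading prefix
// of the compound index's fields?
function usesIndexPrefix(indexSpec, queryFields) {
  const indexFields = Object.keys(indexSpec); // keeps declaration order
  const wanted = new Set(queryFields);
  // Count how many leading index fields appear in the query.
  let covered = 0;
  while (covered < indexFields.length && wanted.has(indexFields[covered])) {
    covered++;
  }
  // All queried fields must be part of that leading run.
  return covered > 0 && covered === wanted.size;
}

const index = { a: 1, b: -1, c: 1 };
console.log(usesIndexPrefix(index, ["a"]));
console.log(usesIndexPrefix(index, ["a", "b"]));
console.log(usesIndexPrefix(index, ["b", "c"]));
```

The order of fields inside the query itself does not matter here; only the order in which the index declares them does.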
  • 55. Remove/Drop data from MongoDB ● Remove db.many.remove({value: 5555}) db.many.find({value: 5555}) db.many.remove() ● Drop db.many.drop() ● Drop database db.dropDatabase() EXTREMELY DANGEROUS!!!
  • 56. How to update data in MongoDB Easiest way: s = db.students.findOne({_id: 1}) s.registered = true db.students.save(s)
  • 57. In place update - update() update( {find spec}, {update spec}, upsert=false) db.students.update( {_id: 1}, {$set: {registered: false}} )
  • 58. Update a non-existent document (upsert) db.students.update( {_id: 2}, {name: 'Mary', age: 9}, true ) db.students.update( {_id: 2}, {$set: {name: 'Mary', age: 9}}, true )
  • 59. set / unset field value db.students.update({_id: 1}, {$set: {"age": 15}}) db.students.update({_id: 1}, {$set: {registered: {2012: false, 2011:true} }}) db.students.update({_id: 1}, {$unset: {registered: 1}})
  • 60. increase/decrease value db.students.update({_id: 1}, { $inc: { "grades.math": 1.1, "grades.english": -1.5, "grades.history": 3.0 } })
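The effect of this update on the document can be sketched in plain JavaScript. applyInc is a hypothetical helper that mimics only the part of $inc shown on the slide: dotted paths reach into sub-documents, and a field that does not exist yet (grades.history here) is created with the increment as its value. Error cases such as non-numeric targets are deliberately left out.

```javascript
// Sketch: apply an $inc-style spec with dotted paths to a plain object.
function applyInc(doc, incSpec) {
  for (const [path, delta] of Object.entries(incSpec)) {
    const keys = path.split(".");
    let target = doc;
    // Walk (and create, if missing) the intermediate sub-documents.
    for (const key of keys.slice(0, -1)) {
      if (target[key] === undefined) {
        target[key] = {};
      }
      target = target[key];
    }
    const last = keys[keys.length - 1];
    const current = target[last] === undefined ? 0 : target[last];
    target[last] = current + delta;
  }
  return doc;
}

const student = { _id: 1, grades: { math: 4.0, english: 5.0 } };
applyInc(student, {
  "grades.math": 1.1,
  "grades.english": -1.5,
  "grades.history": 3.0,
});
console.log(student.grades);
```

A negative value decrements, so there is no separate decrement operator.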
  • 61. push value(s) into array db.students.update({_id: 1},{ $push: {tags: "lazy"} }) db.students.update({_id: 1},{ $pushAll: {tags: ["smart", "cute"]} })
  • 62. add a value to an array only if it does not already exist db.students.update({_id: 1},{ $push: {tags: "lazy"} }) db.students.update({_id: 1},{ $addToSet:{tags: "lazy"} }) db.students.update({_id: 1},{ $addToSet:{tags: {$each: ["tall", "thin"]}} })
  • 63. remove value from array db.students.update({_id: 1},{ $pull: {tags: "lazy"} }) db.students.update({_id: 1},{ $pull: {tags: {$ne: "smart"}} }) db.students.update({_id: 1},{ $pullAll: {tags: ["lazy", "smart"]} })
  • 64. pop value from array a = []; for(i=0;i<20;i++){a.push(i);} db.test.save({_id:1, value: a}) db.test.update({_id: 1}, { $pop: {value: 1} }) db.test.update({_id: 1}, { $pop: {value: -1} })
  • 65. rename field db.test.update({_id: 1}, { $rename: {value: "values"} })
  • 66. Practice: add comments to student Add a field to the student document ({_id: 1}): ● field name: comments ● field type: array of dictionaries (objects) ● field content: ○ { by: author name (string), text: content of comment (string) } ● add at least 3 comments to this field
  • 67. Example answer to practice db.students.update({_id: 1}, { $addToSet: { comments: {$each: [ {by: "teacher01", text: "text 01"}, {by: "teacher02", text: "text 02"}, {by: "teacher03", text: "text 03"}, ]}} })
  • 68. The $ position operator (for array) db.students.update({ _id: 1, "comments.by": "teacher02" }, { $inc: {"comments.$.vote": 1} })
  • 69. Atomically update - findAndModify ● Atomically update SINGLE DOCUMENT and return it ● By default, returned document won't contain the modification made in findAndModify command.
  • 70. findAndModify parameters db.xxx.findAndModify({ query: filter to query sort: how to sort and select 1st document in query results remove: set true if you want to remove it update: update content new: set true if you want to get the modified object fields: which fields to fetch upsert: create object if not exists })
  • 71. GridFS ● MongoDB has a 16MB document size limit ● GridFS is for storing large binary objects in MongoDB ● GridFS is a specification, not an implementation ● Implementation is done by the MongoDB drivers ● Currently supported drivers: ○ PHP ○ Java ○ Python ○ Ruby ○ Perl
  • 72. GridFS - command line tools ● List mongofiles list ● Put mongofiles put xxx.txt ● Get mongofiles get xxx.txt
  • 73. MongoDB config - basic ● dbpath ○ Which folder to put MongoDB database files in ○ MongoDB must have write permission to this folder ● logpath, logappend ○ logpath = log filename ○ MongoDB must have write permission to the log file ● bind_ip ○ IP(s) MongoDB will bind to; by default, all ○ Use commas to separate more than one IP ● port ○ Port number MongoDB will use ○ Default port = 27017
  • 74. Small tip - rotate MongoDB log db.getMongo().getDB("admin").runCommand ("logRotate")
  • 75. MongoDB config - journal ● journal ○ Set journal on/off ○ Usually you should keep this on
  • 76. MongoDB config - http interface ● nohttpinterface ○ The HTTP interface listens on http://localhost:28017 by default; set this option to disable it ○ The interface shows statistics info ● rest ○ Only works when the HTTP interface is enabled ○ Example: http://localhost:28017/test/students/ http://localhost:28017/test/students/?filter_name=John
  • 77. MongoDB config - authentication ● auth ○ By default, MongoDB runs with no authentication ○ If no admin account is created, you could login with no authentication through local mongo shell and start managing user accounts.
  • 78. MongoDB account management ● Add admin user > mongo localhost/admin db.addUser("testadmin", "1234") ● Authenticated as admin user use admin db.auth("testadmin", "1234")
  • 79. MongoDB account management ● Add user to test database use test db.addUser("testrw", "1234") ● Add read only user to test database db.addUser("testro", "1234", true) ● List users db.system.users.find() ● Remove user db.removeUser("testro")
  • 80. MongoDB config - authentication ● keyFile ○ At least 6 characters, and smaller than 1KB ○ Used only for replica/sharding servers ○ Every replica/sharding server should use the same key file for communication ○ On Unix-like systems, the key file must grant no permissions to group/everyone, or MongoDB will refuse to start
  • 81. MongoDB configuration - Replica Set ● replSet ○ Indicates the replica set name ○ All MongoDB instances in the same replica set should use the same name ○ Limitations ■ Maximum 12 nodes in a single replica set ■ Maximum 7 nodes can vote ○ A MongoDB replica set is eventually consistent
  • 82. How does a MongoDB replica set work? ● Each replica set has a single primary (master) node and multiple slave nodes ● Data is only written to the primary node, then synced to the slave nodes. ● Use getLastError() to confirm that the previous write operation is committed to the whole replica set; otherwise the write operation may be rolled back if the primary node goes down before the sync.
  • 83. How does a MongoDB replica set work? ● Once the primary node is down, the whole replica set is marked as failed and accepts no operations until the other nodes vote and elect a new primary node. ● During failover, any write operation not committed to the whole replica set will be rolled back
  • 84. Simple replica set configuration mkdir -p /tmp/db01 mkdir -p /tmp/db02 mkdir -p /tmp/db03 mongod --replSet test --port 29001 --dbpath /tmp/db01 mongod --replSet test --port 29002 --dbpath /tmp/db02 mongod --replSet test --port 29003 --dbpath /tmp/db03
  • 85. Simple replica set configuration mongo localhost:29001
  • 86. Another way to config replica set rs.initiate() rs.add("localhost:29001") rs.add("localhost:29002") rs.add("localhost:29003")
  • 87. Extra options for setting replica set ● arbiterOnly ○ Arbiter nodes don't receive data and can't become the primary node, but can vote. ● priority ○ A node with priority 0 will never be elected as primary node. ○ Higher priority nodes are preferred as primary ○ If you want to force a node to become the primary node, do not change the node's votes; update the node's priority value and reconfig the replica set. ● buildIndexes ○ Can only be set to false on nodes with priority 0 ○ Use false for backup-only nodes
  • 88. Extra options for setting replica set ● hidden ○ Nodes marked with the hidden option are not exposed to MongoDB clients. ○ Nodes marked with the hidden option do not receive queries. ○ Only use this option for nodes used for reporting, integration, backup, etc. ● slaveDelay ○ Number of seconds this slave node deliberately lags behind the primary node ○ Can only be set on nodes with priority 0 ○ Useful for recovering from some human errors
  • 89. Extra options for setting replica set ● votes If set to 1, this node can vote; if set to 0, it cannot.
  • 90. Change primary node at runtime config = rs.conf() config.members[1].priority = 2 rs.reconfig(config)
  • 91. What is sharding? [Diagram: a Name/Value table sorted by name (Alice, Amy, Bob, ..., Yoko, Zeus) is split into three ranges: A to F, G to N, and O to Z]
  • 93. Elements of MongoDB sharding cluster ● Config Server Storing sharding cluster metadata ● mongos Router Routing database operations to correct shard server ● Shard Server Hold real user data
  • 94. Sharding config - config server ● A config server is a MongoDB instance that runs with the --configsvr option ● Config servers are automatically synced by the mongos process, so DO NOT run them with the --replSet option ● The synchronous replication protocol is optimized for three machines.
  • 95. Sharding config - mongos Router ● Use mongos (not mongod) to start a mongos router ● mongos routes database operations to the correct shard servers ● Example command for starting mongos: mongos --configdb db01,db02,db03 ● With the --chunkSize option, you can specify a smaller sharding chunk size if you're just testing.
  • 96. Sharding config - shard server ● A shard server is a MongoDB instance that runs with the --shardsvr option ● A shard server doesn't need to know where the config server / mongos router is
  • 97. Example script for building MongoDB shard cluster mkdir -p /tmp/s00 mkdir -p /tmp/s01 mkdir -p /tmp/s02 mkdir -p /tmp/s03 mongod --configsvr --port 29000 --dbpath /tmp/s00 mongos --configdb localhost:29000 --chunkSize 1 --port 28000 mongod --shardsvr --port 29001 --dbpath /tmp/s01 mongod --shardsvr --port 29002 --dbpath /tmp/s02 mongod --shardsvr --port 29003 --dbpath /tmp/s03
  • 98. Sharding config - add shard server mongo localhost:28000/admin db.runCommand({addshard: "localhost:29001"}) db.runCommand({addshard: "localhost:29002"}) db.runCommand({addshard: "localhost:29003"}) db.printShardingStatus() db.runCommand( { enablesharding : "test" } ) db.runCommand( {shardcollection: "test.shardtest", key: {_id: 1}, unique: true})
  • 99. Let us insert some documents use test for (i=0; i<1000000; i++) { db.shardtest.insert({value: i}); }
  • 100. Remove 1 shard & see what happens use admin db.runCommand({removeshard: "shard0002"}) Let's add it back db.runCommand({addshard: "localhost:29003"})
  • 101. Pick your sharding key wisely ● The sharding key cannot be changed after sharding is enabled ● To update any document in a sharding cluster, the sharding key MUST BE INCLUDED in the find spec EX: sharding key = {name: 1, class: 1} db.xxx.update({name: "xxxx", class: "ooo"},{ ..... update spec })
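The "sharding key must be in the find spec" rule can be sketched as a check you might run before issuing an update. This is a plain JavaScript illustration: `missingShardKeyFields` is a hypothetical helper, not part of the mongo shell, and it only checks that every shard key field appears in the query.

```javascript
// Hypothetical helper: list the shard key fields that the query spec leaves out.
function missingShardKeyFields(shardKey, querySpec) {
  return Object.keys(shardKey).filter(function (field) {
    return !Object.prototype.hasOwnProperty.call(querySpec, field);
  });
}

var shardKey = { name: 1, class: 1 };

// Includes both key fields, so the router can target a single shard
var goodSpec = { name: "xxxx", class: "ooo" };
// Leaves out "class", so the update cannot be routed by shard key
var badSpec = { name: "xxxx" };

console.log(missingShardKeyFields(shardKey, goodSpec));
console.log(missingShardKeyFields(shardKey, badSpec));
```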
  • 102. Pick your sharding key wisely ● The sharding key strongly affects your data distribution model EX: sharding by ObjectId shard001 => data saved 2 months ago shard002 => data saved 1 month ago shard003 => data saved recently
  • 103. Other sharding key examples EX: sharding by Username shard001 => Username starts with a to k shard002 => Username starts with l to r shard003 => Username starts with s to z EX: sharding by md5 => completely random distribution
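The difference between an ever-increasing key (such as ObjectId) and an md5 key can be sketched by counting where the newest inserts land. This is a toy model in Node.js, not MongoDB's real chunk balancing: `rangeShard` and `md5Shard` are hypothetical functions, and picking a shard by `hash % 3` is an assumption made only to keep the example short.

```javascript
var crypto = require("crypto");

var SHARDS = 3;

// Range sharding on an increasing sequence: each shard owns one contiguous slice.
function rangeShard(seq, total) {
  return Math.min(SHARDS - 1, Math.floor(seq / (total / SHARDS)));
}

// Hash sharding: first 8 hex digits of the md5 digest, modulo the shard count.
function md5Shard(key) {
  var hex = crypto.createHash("md5").update(String(key)).digest("hex");
  return parseInt(hex.slice(0, 8), 16) % SHARDS;
}

var total = 3000;
var recent = { range: [0, 0, 0], md5: [0, 0, 0] };

// Look only at the newest 300 inserts out of 3000.
for (var seq = total - 300; seq < total; seq++) {
  recent.range[rangeShard(seq, total)]++;
  recent.md5[md5Shard(seq)]++;
}

// With range sharding every recent write hits the last shard;
// with md5 the same writes are spread over all shards.
console.log(recent);
```

The trade-off is that md5 spreads writes evenly but loses the ability to serve range queries from a single shard.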
  • 104. What is Mapreduce? ● Map, then Reduce ● Map calls a function on each document; the function emits keys & values that are sent to the reduce function ● Reduce calls a function that combines all values emitted for the same key into a single reduced result. ● Example: map students' grades and reduce them into total grades.
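The map-then-reduce flow can be sketched without a database at all. The `runMapReduce` function below is a hypothetical in-memory stand-in written in plain JavaScript; unlike MongoDB, where `emit` is a global, it passes `emit` to the map function as an argument. The student data is made up for illustration.

```javascript
// Hypothetical in-memory mapreduce: group emitted values by key, then reduce each group.
function runMapReduce(docs, mapFn, reduceFn) {
  var groups = {};
  function emit(key, value) {
    (groups[key] = groups[key] || []).push(value);
  }
  // Map phase: the document is bound to `this`, as in MongoDB map functions.
  docs.forEach(function (doc) {
    mapFn.call(doc, emit);
  });
  // Reduce phase: keys with a single value skip reduce, as MongoDB does.
  var out = {};
  Object.keys(groups).forEach(function (key) {
    out[key] = groups[key].length === 1 ? groups[key][0] : reduceFn(key, groups[key]);
  });
  return out;
}

var students = [
  { name: "John", grades: { math: 80, art: 70 } },
  { name: "Alice", grades: { math: 90, art: 60 } }
];

var totals = runMapReduce(
  students,
  function (emit) {
    for (var subject in this.grades) {
      emit(subject, this.grades[subject]);
    }
  },
  function (key, values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
  }
);

console.log(totals);
```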
  • 105. How to call mapreduce in MongoDB db.xxx.mapReduce( map function, reduce function,{ out: output option, query: query filter, optional, sort: sort filter, optional, finalize: finalize function, .... etc })
  • 106. Let's generate some data for (i=0; i<10000; i++){ db.grades.insert({ grades: { math: Math.random() * 100 % 100, art: Math.random() * 100 % 100, music: Math.random() * 100 % 100 } }); }
  • 107. Prepare Map function function map(){ for (var k in this.grades){ emit(k, {total: 1, pass: this.grades[k] >= 60.0 ? 1 : 0, fail: this.grades[k] < 60.0 ? 1 : 0, sum: this.grades[k], avg: 0 }); } }
  • 108. Prepare reduce function function reduce(key, values){ var result = {total: 0, pass: 0, fail: 0, sum: 0, avg: 0}; values.forEach(function(value){ result.total += value.total; result.pass += value.pass; result.fail += value.fail; result.sum += value.sum; }); return result; }
  • 109. Execute your 1st mapreduce call db.grades.mapReduce( map, reduce, {out:{inline: 1}} )
  • 110. Add finalize function function finalize(key, value){ value.avg = value.sum / value.total; return value; }
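Why the average lives in finalize rather than in reduce can be sketched in plain JavaScript. MongoDB may call reduce more than once for the same key, feeding earlier reduce output back in, so reduce should only accumulate sums and counts. The functions below are trimmed-down copies of the deck's reduce and finalize (the pass/fail fields are left out), and the three emitted values are made-up sample data.

```javascript
// Reduce only accumulates; it must give the same answer when re-applied to its own output.
function reduce(key, values) {
  var result = { total: 0, sum: 0 };
  values.forEach(function (value) {
    result.total += value.total;
    result.sum += value.sum;
  });
  return result;
}

// Finalize runs once per key after all reducing is done.
function finalize(key, value) {
  value.avg = value.sum / value.total;
  return value;
}

var emitted = [
  { total: 1, sum: 50 },
  { total: 1, sum: 70 },
  { total: 1, sum: 90 }
];

// Reduce everything in one call
var oneStep = finalize("math", reduce("math", emitted));
// Reduce the first two values, then re-reduce with the third
var partial = reduce("math", emitted.slice(0, 2));
var twoStep = finalize("math", reduce("math", [partial, emitted[2]]));

console.log(oneStep.avg, twoStep.avg);
```

Both paths give the same average because the division happens only once, on the fully accumulated sum and total.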
  • 111. Run mapreduce again with finalize db.grades.mapReduce( map, reduce, {out:{inline: 1}, finalize: finalize} )
  • 112. Mapreduce output options ● {replace: <result collection name>} Replace the contents of the result collection if it already exists. ● {merge: <result collection name>} Merge new results into the existing collection; when a key already exists, the new result overwrites it. ● {reduce: <result collection name>} When the same key exists in both the old and new results, run reduce on the two values. Will run the finalize function if any. ● {inline: 1} Return the result in memory instead of writing it to a collection
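The difference between the replace, merge, and reduce output modes can be sketched with plain objects standing in for the result collection. `applyOutput` is a hypothetical function for illustration only; it models the collection as a key-to-value object and ignores finalize.

```javascript
// Hypothetical model of the three collection output modes.
function applyOutput(mode, existing, fresh, reduceFn) {
  var result = {};
  var key;
  if (mode === "replace") {
    // Old contents are discarded entirely.
    for (key in fresh) result[key] = fresh[key];
    return result;
  }
  // merge and reduce both keep old keys that the new run did not produce.
  for (key in existing) result[key] = existing[key];
  for (key in fresh) {
    if (mode === "reduce" && key in existing) {
      // Same key on both sides: combine old and new with the reduce function.
      result[key] = reduceFn(key, [existing[key], fresh[key]]);
    } else {
      // merge (or a brand new key): the new value wins.
      result[key] = fresh[key];
    }
  }
  return result;
}

function sum(key, values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

var oldResults = { math: 10, art: 5 };
var newResults = { math: 3, music: 7 };

console.log(applyOutput("replace", oldResults, newResults, sum));
console.log(applyOutput("merge", oldResults, newResults, sum));
console.log(applyOutput("reduce", oldResults, newResults, sum));
```

The reduce mode is the one that suits incremental jobs, where each run only processes new documents and folds them into earlier totals.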
  • 113. Other mapreduce output options ● db - put the result collection in a different database ● sharded - the output collection will be sharded using key = _id ● nonAtomic - partial reduce results will be visible while processing.
  • 114. MongoDB backup & restore ● mongodump mongodump -h localhost:27017 ● mongorestore mongorestore -h localhost:27017 --drop ● mongoexport mongoexport -d test -c students -h localhost:27017 > students.json ● mongoimport mongoimport -d test -c students -h localhost:27017 < students.json
  • 115. Conclusion - Pros of MongoDB ● Agile (Schemaless) ● Easy to use ● Built in replica & sharding ● Mapreduce with sharding
  • 116. Conclusion - Cons of MongoDB ● Schemaless = everyone needs to know what the data looks like ● Wastes space on keys ● Eats lots of memory ● Mapreduce is hard to handle
  • 117. Cautions of MongoDB ● Global write lock ○ Add more RAM ○ Use a newer version (MongoDB 2.2 now has a DB-level write lock) ○ Split your database properly ● Removing documents won't free disk space ○ You need to run the compact command periodically ● Don't let your MongoDB data disk fill up ○ Once the free space of the disk used by MongoDB is exhausted, you won't be able to move/delete documents in it.