5. What is a document db?
• One that stores documents
• Popular options:
• MongoDB -- C++
• CouchDB -- Erlang
• Also Amazon’s SimpleDB
• ...what exactly is a document?
6. In the real world
• (Source: http://guide.couchdb.org/draft/why.html)
7. In terms of JSON
• {"name": "John Doe",
• "zip": 10001}
8. What about db schema?
• Schema-less
• Documents with different structures can be stored in the same collection
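To make the schema-less idea concrete, here is a minimal sketch that models a collection as a plain Python list of dicts. The `people` data, the `find` helper, and every field other than `name` and `zip` are hypothetical illustrations, not the API of MongoDB or CouchDB.

```python
# A "collection" modeled as a list of dicts; each dict stands in for a JSON document.
# All records below except the John Doe one are made up for illustration.
people = [
    {"name": "John Doe", "zip": 10001},
    {"name": "Jane Roe", "email": "jane@example.com"},
    {"name": "Acme Corp", "offices": [{"city": "New York", "zip": 10001}]},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion.

    Documents that lack a field are simply skipped; no schema is consulted.
    """
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


in_10001 = find(people, zip=10001)
fields_seen = sorted({field for doc in people for field in doc})

print(in_10001)     # documents with a top-level zip of 10001
print(fields_seen)  # union of all fields used across the collection
```

No document had to declare its fields up front, and a query on a field that some documents lack still works; those documents just do not match.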
42. RWN Math
• R – Number of nodes that are read from.
• W – Number of nodes that are written to.
• N – Total number of nodes in the cluster.
• In general: R < N and W < N for higher availability
43. R+W>N
• Easy to determine consistent state
• R + W = 2N (that is, R = N and W = N)
• absolutely consistent, can provide ACID guarantee
• In all cases when R + W > N there is some overlap between read and write nodes.
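The overlap claim can be checked by brute force. The sketch below, with an illustrative helper name `always_overlap`, enumerates every possible read set and write set in a small cluster and confirms that the two are guaranteed to share a node exactly when R + W > N.

```python
from itertools import combinations


def always_overlap(n, r, w):
    """True if every choice of r read nodes and w write nodes shares a node."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


# Exhaustively compare the brute-force answer with the R + W > N rule.
for n in range(1, 6):
    for r in range(1, n + 1):
        for w in range(1, n + 1):
            assert always_overlap(n, r, w) == (r + w > n)

print(always_overlap(3, 2, 2))  # R + W = 4 > 3
print(always_overlap(3, 1, 2))  # R + W = 3, not > 3
```

When the overlap is guaranteed, at least one node in any read set has seen the latest acknowledged write, which is why a consistent state is easy to determine.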
44. R = 1, W = N
• Suits workloads with more reads than writes
• W = N
• 1 node failure = entire system unavailable for writes
45. R = N, W = 1
• W = 1
• Chance of data inconsistency quite high
• R = N
• Read only possible when all nodes in the cluster are available
46. R = W = ceiling((N + 1)/2)
• Effective quorum for eventual consistency
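As a quick worked example of the quorum formula, the following sketch computes R = W = ceiling((N + 1)/2) for a few cluster sizes. The function name `quorum` and the sample sizes are illustrative assumptions.

```python
import math


def quorum(n):
    """Majority quorum size for a cluster of n nodes: ceiling((n + 1) / 2)."""
    return math.ceil((n + 1) / 2)


for n in (3, 4, 5, 7):
    q = quorum(n)
    # With R = W = q, reads and writes both still succeed while n - q nodes are down.
    print(f"N={n}: R=W={q}, R+W={2 * q}, tolerates {n - q} node failure(s)")
```

With this choice R + W always exceeds N, yet neither reads nor writes require every node, which is the middle ground between the R = 1, W = N and R = N, W = 1 extremes.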
47. Eventual consistency variants
• Causal consistency -- if A writes and then informs B, B always sees the updated value
• Read-your-writes consistency -- after A writes a new value, A never sees the old one
• Session consistency -- read-your-writes consistency within a client session
• Monotonic read consistency -- once a process has seen a new value, it never gets a previous value back
• Monotonic write consistency -- writes by the same process are serialized
48. Dynamo Techniques
• Consistent Hashing (Incremental scalability)
• Vector clocks (high availability for writes)
• Sloppy quorum and hinted handoff (recover from temporary failure)
• Gossip-based membership protocol (periodic, pairwise, inter-process interactions, low reliability, random peer selection)
• Anti-entropy using Merkle trees
• (source: http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf)
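To illustrate why consistent hashing gives incremental scalability, here is a minimal sketch of a hash ring with virtual nodes. It is not Dynamo's implementation: the `HashRing` class, the node names, the SHA-256 hash, and the choice of 64 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(key):
    """Map a string to a position on the ring (SHA-256 is an arbitrary choice here)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []       # sorted list of (position, node)
        self._positions = []  # positions only, kept parallel to _ring for bisect
        for node in nodes:
            self.add(node)

    def _reindex(self):
        self._ring.sort()
        self._positions = [pos for pos, _ in self._ring]

    def add(self, node):
        # Each physical node owns several points ("virtual nodes") on the ring.
        self._ring.extend((ring_position(f"{node}#{i}"), node) for i in range(self.vnodes))
        self._reindex()

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]
        self._reindex()

    def lookup(self, key):
        """Return the node owning the first ring point clockwise from the key."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, ring_position(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.lookup(k) for k in keys}

ring.add("node-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

Every key that changes owner moves to the newly added node; keys never shuffle between the existing nodes, which is what lets a cluster grow one node at a time.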
50. Vector clocks (a trivial example)
• 4 hackers: Joe, Hillary, Eric and Ajay decide to meet up
• Joe -- suggests Palo Alto (t0)
• Hillary and Eric -- decide to meet in Mountain View (t1)
• Eric and Ajay -- decide to meet in Los Altos (t2)
• Joe mails: PA, Hillary responds: Mtn View, Ajay responds: Los Altos (t3)
• both Hillary and Ajay say: Eric knows
51. Vector clocks (how it works)
• Venue: Palo Alto
• Vector Clock: Joe (ver 1)
• Venue: Mountain View
• Vector Clock: Joe (ver 1), Hillary (ver 1), Eric (ver 1)
• Venue: Los Altos
• Vector Clock: Joe (ver 1), Ajay (ver 1), Eric (ver 1)
52. Vector clock (resolution)
• Venue: Palo Alto
• Vector Clock: Joe (ver 1)
• Venue: Mountain View
• Vector Clock: Joe (ver 1), Hillary (ver 1), Ajay (ver 0), Eric (ver 2)
• Venue: Los Altos
• Vector Clock: Joe (ver 1), Hillary (ver 0), Ajay (ver 1), Eric (ver 1)
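The comparison behind the meetup example can be sketched in a few lines. The code below assumes a vector clock is a dict from participant name to version, with a missing entry counting as version 0; the helper names `descends`, `compare`, and `merge` are illustrative and not taken from any particular library. It uses the clocks from the "how it works" slide.

```python
def descends(a, b):
    """True if clock a has seen everything clock b has (a >= b for every participant)."""
    return all(a.get(who, 0) >= version for who, version in b.items())


def compare(a, b):
    """Classify the relationship between two vector clocks."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "descends"    # a is a later version of b
    if b_ge_a:
        return "precedes"    # a is an earlier version of b
    return "concurrent"      # neither has seen the other: a conflict to resolve


def merge(a, b):
    """Element-wise maximum: the clock of a version that has seen both a and b."""
    return {who: max(a.get(who, 0), b.get(who, 0)) for who in a.keys() | b.keys()}


palo_alto = {"Joe": 1}
mountain_view = {"Joe": 1, "Hillary": 1, "Eric": 1}
los_altos = {"Joe": 1, "Ajay": 1, "Eric": 1}

print(compare(mountain_view, palo_alto))  # Mountain View supersedes Palo Alto
print(compare(los_altos, palo_alto))      # so does Los Altos
print(compare(mountain_view, los_altos))  # but these two conflict
print(merge(mountain_view, los_altos))
```

Both later venues descend from Joe's original suggestion, so Palo Alto can be discarded automatically. Mountain View and Los Altos are concurrent, so the clocks alone cannot pick a winner and someone who took part in both conversations (Eric, in the example) has to resolve the conflict.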
55. Redis -- a key-value data structure server
• open source key-value store
• a data structure server
• values in key-value pairs can be strings, hashes, lists, sets, sorted sets
56. Where to find it?
• redis.io
• download a copy from http://redis.io/download
57. Who is building it?
• Core developers
• Salvatore Sanfilippo, twitter: @antirez
• Pieter Noordhuis, twitter: @pnoordhuis
• Main sponsor
• VMware
58. Written in
• ANSI C
• runs on POSIX compliant systems with no external dependencies
59. How can it be used?
• as an in memory data store
• with option to persist to disk
• in standalone mode or as a master-slave replicated set
• Redis cluster -- coming soon! (June 2011)
• as cache
61. Download and install
• curl -O http://redis.googlecode.com/files/redis-2.2.0-rc4.tar.gz
• (just a 436kb download)
• tar zxvf redis-2.2.0-rc4.tar.gz
• cd redis-2.2.0-rc4
• make && make install (installs in /usr/local/bin)
• make test (to be sure you installed it correctly)
62. Start the redis-server
• /usr/local/bin/redis-server
• ...Server started, Redis version 2.1.12
• ...The server is now ready to accept connections on port 6379
63. Connect with redis-cli
• /usr/local/bin/redis-cli
• redis> set key1 val1
• OK
• redis> get key1
• "val1"
64. String key-value pairs
• like memcached
• with persistence
• key and value -- binary-safe strings
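One reason keys and values can be binary-safe is that the Redis request protocol prefixes every argument with its byte length instead of relying on delimiters. The sketch below builds a request in that multi-bulk format as described in the Redis protocol documentation. `encode_command` is an illustrative helper, not part of any Redis client library, and it only constructs the bytes without opening a connection.

```python
def encode_command(*args):
    """Encode a command as a Redis multi-bulk request.

    Layout: *<argument count>\r\n, then for each argument $<byte length>\r\n<bytes>\r\n.
    """
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        # Accept raw bytes as-is; encode anything else as UTF-8 text.
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


# The same command typed at the redis-cli prompt earlier: set key1 val1
print(encode_command("SET", "key1", "val1"))

# A key with a NUL byte and a value containing CR LF are carried unchanged,
# because the receiver reads exactly the announced number of bytes.
print(encode_command("SET", b"k\x00ey", b"\r\n\xff"))
```

Because lengths are explicit, a value may contain any byte sequence, including the line terminator the protocol itself uses.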