2. Me
‘boorad’ most places (twitter, github, etc.)
Erlang Programmer
Cloudant BigCouch, Ericsson Monaco, Verdeeco
Java, Python, D, Javascript, Common Lisp
NoSQL East - October 2009
Data Warehousing / Big Data
pre-lunch talks... always.
7. Seriously, you don’t...
Vastly different performance characteristics
Immature APIs and tools / ecosystems
Bugs; most of these systems are still under active development
Your situation doesn’t warrant it
8. Why do they exist?
Every one of these new data storage systems
came from a particular pain someone was
having.
Each system was created to specifically solve
the pain point the authors were experiencing.
This pain usually involves a metric shit-tonne of
data, so distributed processing is required.
Schema-free
18. Dynamo - how does it work?
N=3
W=2
[Diagram: ring of partitions labeled A through Z, with Node 1 through Node 4 placed around it and partition letters listed beside each node]
19. Dynamo - how does it work?
PUT http://boorad.cloudant.com/dbname/blah?w=2
N=3
W=2
[Diagram: ring of partitions labeled A through Z, with Node 1 through Node 4 placed around it and partition letters listed beside each node]
21. Dynamo - how does it work?
PUT http://boorad.cloudant.com/dbname/blah?w=2
N=3
W=2
[Diagram: ring of partitions labeled A through Z, with Node 1 through Node 4 placed around it and partition letters listed beside each node; hash(blah) is marked on the ring]
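The write path on these slides (hash the key onto the ring, store it on N=3 nodes, call the write good once W=2 have acknowledged) can be sketched roughly as below. This is an illustration only, not BigCouch or Dynamo code: the `Ring` class, the `put` function, the node names, the choice of MD5, and the `down` parameter are all assumptions made up for the example.

```python
import hashlib
from bisect import bisect_right


def ring_position(value: str) -> int:
    """Map a string to a point on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical hash ring: each node sits at hash(node name)."""

    def __init__(self, nodes, n=3):
        self.n = n
        self.points = sorted((ring_position(name), name) for name in nodes)
        self.positions = [pos for pos, _ in self.points]

    def preference_list(self, key):
        """Return the N nodes found walking clockwise from hash(key)."""
        start = bisect_right(self.positions, ring_position(key))
        count = min(self.n, len(self.points))
        return [self.points[(start + i) % len(self.points)][1] for i in range(count)]


def put(ring, stores, key, value, w=2, down=frozenset()):
    """Write to every reachable replica; succeed if at least W acknowledged."""
    acks = 0
    for node in ring.preference_list(key):
        if node in down:  # simulated failure, no ack from this replica
            continue
        stores[node][key] = value
        acks += 1
    return acks >= w


nodes = ["node1", "node2", "node3", "node4"]  # made-up node names
ring = Ring(nodes, n=3)
stores = {name: {} for name in nodes}
ok = put(ring, stores, "blah", {"hello": "world"}, w=2)
```

With N=3 and W=2 the write still succeeds when one of the three replicas is down, and fails when two are down.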
23. CAP Theorem
Pick Two (at any given time)
Consistency
Availability
Partition Tolerance
CP refuses requests during a partition; AP stays available and is eventually consistent
Must Read: http://codahale.com/you-cant-sacrifice-partition-tolerance/
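The CP versus AP distinction can be shown with a toy pair of replicas. This is a sketch under heavy assumptions: the `Replica` class, `make_pair`, and the `heal` method are invented for illustration, and conflict resolution is ignored entirely (queued writes simply overwrite the peer on heal).

```python
class Replica:
    """Toy replica with one peer; mode is "CP" or "AP" (hypothetical model)."""

    def __init__(self, mode):
        self.mode = mode
        self.data = {}
        self.peer = None
        self.partitioned = False
        self.backlog = []  # writes accepted while the peer was unreachable

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than diverge from the peer.
                raise RuntimeError("refused: peer unreachable")
            # Availability first: accept locally, reconcile later.
            self.data[key] = value
            self.backlog.append((key, value))
            return
        self.data[key] = value
        self.peer.data[key] = value

    def heal(self):
        """Partition ends: push queued writes so the peer catches up."""
        self.partitioned = False
        for key, value in self.backlog:
            self.peer.data[key] = value
        self.backlog.clear()


def make_pair(mode):
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b
```

In AP mode the peer serves stale data until `heal` runs, which is the "eventually" in eventually consistent.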
29. Disk Data Structure
btree - many different kinds
mmap - compact bson
memtable/sstable or log structured merge tree
log-structured linear hashing
adjacency lists / adjacency matrices
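As one example from this list, the memtable/sstable idea can be sketched in a few lines. This is a simplified in-memory model, not any particular database's implementation: the `TinyLSM` class and `memtable_limit` parameter are assumptions, the "sstables" are sorted Python lists rather than files, and compaction and deletes are left out.

```python
from bisect import bisect_left


class TinyLSM:
    """Hypothetical log-structured store: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []  # oldest first; each entry is (sorted_keys, values)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a sorted, read-only run."""
        if not self.memtable:
            return
        keys = sorted(self.memtable)
        self.sstables.append((keys, [self.memtable[k] for k in keys]))
        self.memtable = {}

    def get(self, key):
        """Check the memtable first, then runs from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.sstables):
            i = bisect_left(keys, key)  # binary search in the sorted run
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None
```

Writes only ever touch the memtable or append a new run, which is why this layout favors write-heavy workloads; reads pay for it by checking several runs.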
30. Querying NoSQL
Key Lookups
fast, easy, limiting
Secondary Indexes
Immature part of most systems
Roll your own
MapReduce
Mongo query language
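To make "roll your own" secondary indexes and MapReduce concrete, here is a small sketch on top of a plain key lookup store. Everything in it is hypothetical: `IndexedStore`, `find`, and `map_reduce` are invented names, the store is an in-memory dict, and indexed values are assumed to be hashable.

```python
from collections import defaultdict


class IndexedStore:
    """Key-value store that maintains one secondary index by hand."""

    def __init__(self, indexed_field):
        self.docs = {}
        self.indexed_field = indexed_field
        self.index = defaultdict(set)  # field value -> set of primary keys

    def put(self, key, doc):
        old = self.docs.get(key)
        if old is not None and self.indexed_field in old:
            # Remove the stale index entry before re-indexing.
            self.index[old[self.indexed_field]].discard(key)
        self.docs[key] = doc
        if self.indexed_field in doc:
            self.index[doc[self.indexed_field]].add(key)

    def get(self, key):
        """Primary key lookup: fast, easy, limiting."""
        return self.docs.get(key)

    def find(self, value):
        """Secondary lookup: primary keys whose indexed field equals value."""
        return sorted(self.index.get(value, ()))


def map_reduce(docs, map_fn, reduce_fn):
    """Single-process map/reduce: group mapped pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for k, v in map_fn(doc):
            groups[k].append(v)
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}


store = IndexedStore("city")
store.put("u1", {"name": "Ann", "city": "Atlanta"})
store.put("u2", {"name": "Bob", "city": "Boston"})
store.put("u3", {"name": "Cy", "city": "Atlanta"})
in_atlanta = store.find("Atlanta")
per_city = map_reduce(store.docs.values(),
                      lambda d: [(d["city"], 1)],
                      lambda city, ones: sum(ones))
```

The cost of rolling your own is visible in `put`: every write has to keep the index in step with the documents, and nothing here makes those two updates atomic.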
31. Polyglot Persistence
[Diagram: Raw Data, Hadoop, batch processes, RDBMS, NoSQL, Cache, Apps]