1. Adding Riak to your NoSQL Bag of Tricks
NoSQL-NYC, October 2010
Alexander Sicular
@siculars
2. Riak, eh?
• Dynamo inspired
• Homogeneous
• Single key-space
• Distributed
• Replicated
• Predictable scalability
• Data agnostic
3. Origins
Show me your friends...
• Amazon’s Dynamo
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
• Akamai
http://www.basho.com/bios.html
Paramount Home Video
4. CAP Theorem
http://en.wikipedia.org/wiki/CAP_theorem
• Consistency
• Availability
• Partition tolerance
Pick two?
Riak says: pick two at a time.
http://guide.couchdb.org/draft/consistency.html
5. Homogeneous
• Every node is the same
• Any node can service any request
• Nodes gossip on their own port
6. One Ring to Rule Them All
Single 160-bit key space
Huh?
No Sharding!
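A minimal sketch of what a single 160-bit key space means in practice: hash the bucket and key with SHA-1 (160 bits) and see which equal-sized slice of the ring the result lands in. The function names and the plain `bucket + "/" + key` string encoding are assumptions for illustration; Riak's own encoding of the bucket/key pair differs.

```javascript
const crypto = require("crypto");

// 2^160: the size of the SHA-1 key space.
const RING_TOP = 2n ** 160n;

// Hash a bucket/key pair to a position on the ring (a 160-bit integer).
// Assumption: simple string concatenation stands in for Riak's real encoding.
function ringPosition(bucket, key) {
  const hex = crypto.createHash("sha1").update(bucket + "/" + key).digest("hex");
  return BigInt("0x" + hex);
}

// Map a ring position to one of `ringSize` equal partitions.
function partitionIndex(position, ringSize) {
  return Number((position * BigInt(ringSize)) / RING_TOP);
}

const pos = ringPosition("testBucket", "testKey");
console.log(partitionIndex(pos, 64)); // an integer from 0 to 63
```

Every node can run the same calculation, so no lookup table of shards is needed to find where a key lives.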
7. Distributed (!= replicated)
• Riak is not sharded
• vnodes = units of distribution
• vnodes != physical nodes (pnodes)
• vnodes map to pnodes
• data is distributed at the vnode level
★ Considerations:
- must plan maximum ring size
- think about number of vnodes per pnode
- generally no less than 10 vnodes per pnode
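To make the vnode/pnode arithmetic concrete, here is a rough sketch that spreads a fixed number of vnodes over a list of pnodes and counts the result. Round-robin assignment and the node names are assumptions for illustration; Riak's actual claim algorithm is more involved.

```javascript
// Assign each vnode (partition) to a physical node in round-robin order.
// Assumption: round-robin is a simplification of how Riak claims partitions.
function assignVnodes(ringSize, pnodes) {
  const owners = [];
  for (let vnode = 0; vnode < ringSize; vnode++) {
    owners.push(pnodes[vnode % pnodes.length]);
  }
  return owners;
}

// Count how many vnodes each pnode owns.
function vnodesPerPnode(owners) {
  const counts = {};
  for (const pnode of owners) {
    counts[pnode] = (counts[pnode] || 0) + 1;
  }
  return counts;
}

const owners = assignVnodes(64, ["node1", "node2", "node3", "node4"]);
console.log(vnodesPerPnode(owners)); // 16 vnodes on each of the four pnodes
```

With a 64-vnode ring, seven pnodes already leave some nodes with only 9 vnodes, which is why the ring size has to be planned for the largest cluster you expect.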
9. Replicated (!= distributed)
• configurable replication values (“N”)
• configurable consistency and availability
values at read and write time
- read
- write
- durable write
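A small sketch of how N, R and W trade consistency against availability. In Dynamo-style systems, R + W > N means the replicas answering a read overlap the replicas that acknowledged the latest write. The function and field names are hypothetical, and durable write (DW) is left out to keep the example short.

```javascript
// Describe what a given N/R/W combination implies.
// Assumption: a strict quorum model; it ignores fallback nodes and DW.
function describeQuorum(n, r, w) {
  if (r < 1 || w < 1 || r > n || w > n) {
    throw new RangeError("R and W must be between 1 and N");
  }
  return {
    readOverlapsWrite: r + w > n,   // reads touch at least one up-to-date replica
    readFailuresTolerated: n - r,   // replicas that may be down for a read
    writeFailuresTolerated: n - w,  // replicas that may be down for a write
  };
}

console.log(describeQuorum(3, 2, 2));
// { readOverlapsWrite: true, readFailuresTolerated: 1, writeFailuresTolerated: 1 }
console.log(describeQuorum(3, 1, 1));
// { readOverlapsWrite: false, readFailuresTolerated: 2, writeFailuresTolerated: 2 }
```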
10. Predictable Scalability
• How much performance per node?
• Scale in both directions
>bin/riak-admin
>Usage: riak-admin { join | leave | backup | restore | test | status | reip | js_reload | wait-for-service | ringready | transfers }
11. Data Agnostic
• schemaless
• data objects may be of any type
• binary, text (json, xml)
• use content types
>curl -v -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/riak/testBucket/testKey
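The same request can be sketched from JavaScript. The helper below only builds the URL and the options for `fetch()`; `buildStoreRequest` is a hypothetical name, and the host, port and `/riak/bucket/key` path are taken from the curl example. It uses POST because that is what `curl -d` sends.

```javascript
// Build the URL and fetch() options for storing one object over HTTP.
// The default base URL mirrors the curl example; adjust for your node.
function buildStoreRequest(bucket, key, body, contentType, base = "http://127.0.0.1:8098") {
  const url = `${base}/riak/${encodeURIComponent(bucket)}/${encodeURIComponent(key)}`;
  return {
    url,
    options: {
      method: "POST",
      headers: { "Content-Type": contentType }, // tells Riak how to label the data
      body,
    },
  };
}

const request = buildStoreRequest(
  "testBucket",
  "testKey",
  JSON.stringify({ title: "this is a test" }),
  "application/json"
);
console.log(request.url); // http://127.0.0.1:8098/riak/testBucket/testKey
// To send it against a running node: fetch(request.url, request.options)
```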
15. Bitcask
• Riak’s default disk backend
• Append-only (write-only) log
• Heavy updates will grow your footprint
- Look into compaction/merging settings
• Keys are cached in memory with disk offsets
https://spreadsheets.google.com/ccc?key=0Ak4OBkABJPsxdEowYXc2akxnYU9xNkJmbmZscnhaTFE&hl=en&authkey=CMHw8tYO
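A toy model of the Bitcask idea, to show why heavy updates grow the footprint and what merging buys back. This is a sketch under simplifying assumptions, not Bitcask's implementation: an array stands in for the on-disk log, and the class and method names are made up.

```javascript
// A toy append-only store: every write appends a record to the log, and an
// in-memory map (the "keydir") remembers where the latest record for a key is.
class ToyBitcask {
  constructor() {
    this.log = [];           // stands in for the on-disk file
    this.keydir = new Map(); // key -> offset of the latest record in the log
  }

  put(key, value) {
    this.log.push({ key, value });
    this.keydir.set(key, this.log.length - 1);
  }

  get(key) {
    const offset = this.keydir.get(key);
    return offset === undefined ? undefined : this.log[offset].value;
  }

  // Rewrite the log, keeping only the latest record for each key.
  merge() {
    const live = [];
    const keydir = new Map();
    for (const [key, offset] of this.keydir) {
      keydir.set(key, live.length);
      live.push(this.log[offset]);
    }
    this.log = live;
    this.keydir = keydir;
  }
}

const store = new ToyBitcask();
store.put("a", 1);
store.put("a", 2);
store.put("a", 3);
console.log(store.log.length); // 3: every update added a record
store.merge();
console.log(store.log.length, store.get("a")); // 1 3
```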
17. Ok sounds good.
How do I get it?
>hg clone http://bitbucket.org/basho/riak
>cd riak
>make all && make rel
OR if you’re on a mac:
>brew install riak
18. What does that get me?
• Fully functional
• Self contained (<3)
• Default configuration
- 64 vnodes, “riak” cookie, N = 3
22. Links
• Lightweight Graphing
• Practical limitations re. number of links per object
• Unidirectional object linking
• relationship modeling (one to one, one to many)
• Returns “Content-Type: multipart/mixed;”
- Library needs to be multipart aware
- nodejs, formidable
23. Link Walking
First level depth
>curl http://localhost:8098/riak/myBucket/myKey/_,_,_
Via Map/Reduce
>curl -X POST -H "content-type:application/json" http://localhost:8098/mapred --data @-
{"inputs":[["myBucket","myKey"]],"query":[{"link":{}},{"map":{"language":"javascript","source":"function(v) { return [v]; }"}}]}
^D
N level depth
>curl http://localhost:8098/riak/myBucket/myKey/_,_,_/_,_,_
More Info:
http://blog.basho.com/2010/02/24/link-walking-by-example/
http://wiki.basho.com/display/RIAK/Links
http://wiki.basho.com/display/RIAK/REST+API#RESTAPI-Linkwalking
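Each path segment after the key is one walk step of the form bucket,tag,keep, with `_` as the wildcard. A small helper, sketched here with a hypothetical name and argument shape, makes the pattern easier to see:

```javascript
// Build a link-walking URL. Each step is { bucket, tag, keep }; any field
// left out becomes the "_" wildcard seen in the curl examples.
function linkWalkUrl(base, bucket, key, steps) {
  const specs = steps.map((step) =>
    [step.bucket, step.tag, step.keep]
      .map((part) => (part === undefined ? "_" : encodeURIComponent(String(part))))
      .join(",")
  );
  return `${base}/riak/${encodeURIComponent(bucket)}/${encodeURIComponent(key)}/${specs.join("/")}`;
}

// First level depth, everything wildcarded:
console.log(linkWalkUrl("http://localhost:8098", "myBucket", "myKey", [{}]));
// http://localhost:8098/riak/myBucket/myKey/_,_,_

// Two levels, following only links tagged "friend" on the first hop:
console.log(linkWalkUrl("http://localhost:8098", "myBucket", "myKey", [{ tag: "friend" }, {}]));
// http://localhost:8098/riak/myBucket/myKey/_,friend,_/_,_,_
```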
24. Map/Reduce
• Functions written in either Erlang or JavaScript
• Map is distributed to where the data lives
• Reduce is run on the node coordinating the M/R
• Erlang > JavaScript
• Tweak JavaScript settings in app.config
25. M/R in Riak
• An input to start from
• bucket
• list of keys
★ keys > bucket
• possible link phase
• one or more map phases
• (many) possible reduce phase(s)
function(v, keydata, args) {
  if (v.values) {
    var ret = [], o = {};
    o = Riak.mapValuesJson(v)[0];
    o.lastModifiedParsed = Date.parse(v["values"][0]["metadata"]["X-Riak-Last-Modified"]);
    o.key = v["key"];
    ret.push(o);
    return ret;
  } else {
    return [];
  }
}
Map = SQL Where clause
Reduce = SQL Aggregates (SUM, COUNT, GROUP BY)
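The SQL analogy can be tried locally without a cluster. The sketch below runs a map function over a few in-memory objects and then a reduce over the results; `runMapReduce` and the sample data are stand-ins for illustration and say nothing about how Riak schedules the phases.

```javascript
// Map phase: runs once per object; return [] to drop it (the "WHERE").
function mapPublished(post) {
  return post.published ? [post.views] : [];
}

// Reduce phase: takes a list of values and returns a list (the "SUM").
function reduceSum(values) {
  return [values.reduce((total, n) => total + n, 0)];
}

// A stand-in for the coordinator: apply map to every input, then reduce.
function runMapReduce(inputs, map, reduce) {
  const mapped = inputs.flatMap((input) => map(input));
  return reduce(mapped);
}

const posts = [
  { published: true, views: 10 },
  { published: false, views: 99 },
  { published: true, views: 5 },
];
console.log(runMapReduce(posts, mapPublished, reduceSum)); // [ 15 ]
```

Both phases take and return lists, which is why a map can filter by returning an empty list and a reduce can be fed its own output.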
26. Pre/Post Commit
Hooks
• Pre Commit
- JavaScript or Erlang
- Validation
- Modify data
- Kill writes
• Post Commit
- Erlang
- Indexing
- Messaging
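A validation hook might look like the sketch below: accept the write only when the stored value parses as JSON. The object shape (`values[0].data`) and the convention of returning the object to accept or `{fail: reason}` to reject are stated here as assumptions about Riak's JavaScript hook interface; check the hook documentation for your version before relying on them.

```javascript
// A pre-commit validation hook: allow the write only if the value is JSON.
// Assumption: the hook receives the Riak object and signals rejection by
// returning { fail: reason } instead of the object.
function precommitMustBeJson(object) {
  try {
    JSON.parse(object.values[0].data);
    return object; // returning the object lets the write proceed
  } catch (e) {
    return { fail: "Object is not valid JSON" }; // kills the write
  }
}

const good = { key: "post1", values: [{ data: '{"title":"hello"}' }] };
const bad = { key: "post2", values: [{ data: "not json" }] };
console.log(precommitMustBeJson(good) === good); // true
console.log(precommitMustBeJson(bad)); // { fail: 'Object is not valid JSON' }
```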
27. Code demo
• nodejs
• riak-js
• redis
• simple post site
• tags
• json data passing
28. Chief complaints
• No index
• No native sort
• No increment
• No full text search *
*Yet ;) inc Riak Search!
http://www.slideshare.net/rklophaus/riak-search-erlang-factory-london-2010
29. Hybrid architectures are the future!
Use tools like Redis to augment shortcomings!
30. 1,456,023 Or “A Lot”
• At scale, precision does not matter in practice.
• Google
• Twitter
http://photography.nationalgeographic.com/photography/enlarge/okavango-cape-buffalo_pod_image.html