This presentation introduces Cassandra and column family datastores in general. I will discuss what Cassandra is, how and when it is useful, and how it integrates with Rails. I will also go into lessons learned during our 3-month project and the useful patterns that emerged. The discussion will be very technical, but it is targeted at developers who are not familiar with Cassandra or have not yet done a project with it.
3. What is Cassandra?
Column Family database
Distributed design - “eventually consistent”
Open sourced by Facebook, now Apache
Used by Facebook, Twitter, Digg, Rackspace...
Largest cluster: 100 TB, 150 nodes
Still very immature - at version 0.6.2
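To make "eventually consistent" concrete, here is a minimal in-memory sketch in plain Ruby. It does not talk to Cassandra: the `Replica` class and its `sync` method are hypothetical names, and newest-timestamp-wins is used as a simplified stand-in for how conflicting writes get reconciled.

```ruby
# Toy model of eventual consistency: plain Ruby, not Cassandra client code.
# Each replica stores key => [value, timestamp]; the newest timestamp wins.
class Replica
  attr_reader :data

  def initialize
    @data = {}
  end

  # Accept a write only if it is newer than what this replica already holds.
  def write(key, value, timestamp)
    current = @data[key]
    @data[key] = [value, timestamp] if current.nil? || timestamp > current[1]
  end

  def read(key)
    entry = @data[key]
    entry && entry[0]
  end

  # Hypothetical repair step: push every entry to another replica.
  def sync(other)
    @data.each { |key, (value, timestamp)| other.write(key, value, timestamp) }
  end
end

a = Replica.new
b = Replica.new

a.write("user:1:name", "alice", 1)
a.sync(b)
a.write("user:1:name", "alicia", 2)  # only replica a has seen this write

stale = b.read("user:1:name")        # replica b still returns the old value
a.sync(b)
fresh = b.read("user:1:name")        # after syncing, both replicas agree
```

The point of the sketch is the window between the second write and the sync: a read can return an old value for a while, but the replicas converge once the update propagates.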
4. Relational vs Column Family
Relational | Column Family
Schema-ful | Schema-less (mostly)
Row-based | Column-based
Robust SQL queries | No query language
Transactional | Eventually consistent *
ODBC/JDBC | Thrift *
Fast - 300/350 ms | Blazing - 0.12/15 ms **
* Cassandra-specific
** MySQL vs Cassandra, > 50GB data, write/read
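One way to picture the "schema-less, no query language" side of this comparison is as nested hashes. The sketch below is plain Ruby for illustration only; the `users` data and the `get_column` helper are made up and no Cassandra client is involved.

```ruby
# A column family pictured as nested hashes: row key => { column name => value }.
# Illustration only; the data and helper names are hypothetical.
users = {
  "alice" => { "email" => "alice@example.com", "city" => "Denver" },
  "bob"   => { "email" => "bob@example.com", "twitter" => "@bob", "age" => "31" }
}

# Rows in the same column family need not share the same columns.
shared_columns = users.values.map(&:keys).reduce(:&)

# With no query language, a read is a lookup by row key, then by column name.
def get_column(column_family, row_key, column_name)
  row = column_family.fetch(row_key, {})
  row[column_name]
end

bob_twitter   = get_column(users, "bob", "twitter")
alice_twitter = get_column(users, "alice", "twitter") # nil: this row has no such column
```

Anything that SQL would answer with a WHERE clause or a join has to be planned for up front, typically by writing the data under the row key you intend to read it by.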
6. What about ORM?
Cassandra is NOT a relational database
No ActiveRecord support (currently)
No Hibernate support (currently)
OCFM? Lots of room for jars/gems
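As a rough idea of what such a mapping gem might offer, here is a minimal sketch backed by an in-memory Hash instead of a real cluster. Everything in it is hypothetical (`STORE`, `ColumnFamilyModel`, `Post`, and the `find`/`save` methods); it is not the API of any existing gem.

```ruby
# Hypothetical object-to-column-family mapper, backed by an in-memory Hash.
# STORE stands in for a keyspace: column family name => row key => columns.
STORE = Hash.new { |hash, column_family| hash[column_family] = {} }

class ColumnFamilyModel
  attr_reader :key, :columns

  # The class name doubles as the column family name in this sketch.
  def self.column_family
    name.to_sym
  end

  # Look up a row by key; returns nil when the row does not exist.
  def self.find(key)
    columns = STORE[column_family][key]
    columns && new(key, columns)
  end

  def initialize(key, columns = {})
    @key = key
    @columns = columns.dup
  end

  def [](name)
    @columns[name.to_s]
  end

  # Column names and values are stored as strings in this sketch.
  def []=(name, value)
    @columns[name.to_s] = value.to_s
  end

  def save
    STORE[self.class.column_family][key] = @columns.dup
    self
  end
end

class Post < ColumnFamilyModel; end

post = Post.new("post-1")
post[:body] = "Cassandra has no joins"
post[:votes] = 3
post.save

found = Post.find("post-1")
found[:votes] # "3", because values are stringified on assignment
```

Note what is missing compared with ActiveRecord: no declared attributes, no associations, and no finders other than lookup by key. That gap is a direct consequence of the data model, not an oversight in the sketch.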
7. An example in Rails
bitchroom - a place to bitch and whine
Twitter-like features - user post timelines
Digg-like features - up/down, fav, reply