1. On Rails with Apache Cassandra
Austin on Rails
April 27th 2010
Stu Hood (@stuhood) – Technical Lead, Rackspace
2. My, what a large/volatile dataset you have!
● Large
  ● Larger than one node can handle
● Volatile
  ● More than 25% (ish) writes
  ● (but still larger than available memory)
● Expensive
  ● More than you can afford with a commercial solution
3. My, what a large/volatile dataset you have!
● For example:
  ● Event/log data
  ● Output of batch processing or log analytics jobs
  ● Social network relationships/updates
● In general:
  ● Large volume of high-fanout data
4. Conversely...
● If your pattern easily fits one RDBMS machine:
  ● Don't use Cassandra
  ● Possibly consider MongoDB, CouchDB, Neo4j, Redis, etc.
    – For schema freedom and flexibility
5. Case Study: Digg
1. Vertical partitioning and master/slave trees
2. Developed sharding solution
  ● IDDB
  ● Awkward replication, fragile scaling
3. Began populating Cassandra in parallel
  ● Initial dataset for 'green badges'
    – 3 TB
    – 76 billion kv pairs
  ● Most applications being ported to Cassandra
7. Standing on the shoulders of: Amazon Dynamo
● No node in the cluster is special
  ● No special roles
  ● No scaling bottlenecks
  ● No single point of failure
● Techniques
  ● Gossip
  ● Eventual consistency
8. Standing on the shoulders of: Google Bigtable
● “Column family” data model
● Range queries for rows:
  ● Scan rows in order
● Memtable/SSTable structure
  ● Always writes sequentially to disk
  ● Bloom filters to minimize random reads
  ● Trounces B-Trees for big data
    – Linear insert performance
    – Log growth for reads
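The Bloom-filter bullet is worth making concrete: Cassandra keeps one filter per SSTable so a read can skip files that definitely do not contain the requested key. A minimal Ruby sketch of the idea (toy sizing and hashing, not Cassandra's implementation):

```ruby
require 'digest/md5'

class BloomFilter
  def initialize(bits = 1024, hashes = 3)
    @bits, @hashes = bits, hashes
    @bitmap = Array.new(bits, false)
  end

  # Derive k positions from slices of an MD5 digest of the key.
  def positions(key)
    digest = Digest::MD5.hexdigest(key)
    (0...@hashes).map { |i| digest[i * 8, 8].to_i(16) % @bits }
  end

  def add(key)
    positions(key).each { |p| @bitmap[p] = true }
  end

  # false => key is definitely absent (skip this SSTable);
  # true  => key *may* be present (go read the file).
  def maybe_include?(key)
    positions(key).all? { |p| @bitmap[p] }
  end
end

filter = BloomFilter.new
filter.add("user19")
filter.maybe_include?("user19")  # => true
```

A "no" answer is always correct, so the only cost of the filter is an occasional false positive that triggers an unnecessary disk read.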
9. Enter Cassandra
● Hybrid of ancestors
  ● Adopts listed features
● And adds:
  ● Pluggable partitioning
  ● Multi-datacenter support
    – Pluggable locality awareness
  ● Datamodel improvements
10. Enter Cassandra
● Project status
  ● Open sourced by Facebook in 2008 (Facebook no longer active in the project)
  ● Apache License, Version 2.0
  ● Graduated to Apache TLP February 2010
  ● Major releases: 0.3 through 0.6.1 (0.7 this summer)
  ● cassandra.apache.org
● Known deployments at:
  ● Cloudkick, Digg, Mahalo, SimpleGeo, Twitter, Rackspace, Reddit
11. The Datamodel
● Cluster
● Nodes have Tokens:
  ● OrderPreservingPartitioner: actual keys
  ● RandomPartitioner: MD5s of keys
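The trade-off between the two partitioners can be sketched in a few lines of Ruby (illustrative tokens, not Cassandra's actual ring arithmetic):

```ruby
require 'digest/md5'

# OrderPreservingPartitioner: the token *is* the key, so rows stay in key
# order on the ring — range scans work, but popular key ranges can
# hot-spot individual nodes.
order_preserving_token = ->(key) { key }

# RandomPartitioner: the token is MD5(key), so load spreads evenly across
# the ring, but key ordering is lost.
random_token = ->(key) { Digest::MD5.hexdigest(key) }

keys = %w[apple banana cherry]

keys.sort_by(&order_preserving_token)  # => ["apple", "banana", "cherry"]
keys.sort_by(&random_token)            # balanced, but effectively shuffled
```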
15. The Datamodel
Cluster > Keyspace > Column Family > Row > “Column”
● Not like an RDBMS column: an attribute of the row; each row can contain millions of different columns
● Name → Value: bytes → bytes, plus a version timestamp
17. StatusApp Example
<ColumnFamily Name=”Users”>
● Unique id as key: name->value pairs contain user attributes
{key: “rails_user”, row: {“fullname”: “Damon Clinkscales”, “joindate”: “back_in_the_day”, … }}
18. StatusApp Example
<ColumnFamily Name=”Timelines”>
● User id and timeline name as key: row contains list of updates from that timeline
{key: “user19:personal”, row: {<timeuuid1>: “status19”, <timeuuid2>: “status21”, … }}
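Why TimeUUID column names? Cassandra sorts columns within a row by name, so a timeline row comes back in chronological order for free. A plain-Ruby sketch of that property, with [timestamp, sequence] arrays standing in for TimeUUIDs:

```ruby
# A row is conceptually a map sorted by column name. Inserting out of
# order doesn't matter: reads always see columns in name order.
row = {}
row[[1272300000, 1]] = "status19"
row[[1272300060, 2]] = "status21"
row[[1272299940, 0]] = "status17"

# Reading back in column-name order yields the updates chronologically:
ordered = row.sort.map { |_uuid, status| status }
# => ["status17", "status19", "status21"]
```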
19. Raw Client API
● Thrift RPC framework
  ● Generates client bindings for (almost) any language
1. Get the most recent status in a timeline:
  ● get_slice(keyspace, key, [column_family, column_name], predicate, consistency_level)
  ● get_slice(“statusapp”, “userid19:personal”, [“Timelines”], {start: ””, count: 1}, QUORUM)
    > <timeuuid1>: “status19”
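To make the slice call above concrete, here is a toy model of what a slice read does over a row's sorted columns. The helper name and hash-based predicate are illustrative, not the actual Thrift structs:

```ruby
# Columns live sorted by name; a slice with {start: "", count: 1} returns
# the first `count` columns at or after `start`.
def get_slice(columns, start, count)
  columns.sort
         .drop_while { |name, _| !start.empty? && name < start }
         .first(count)
         .to_h
end

columns = { "a-timeuuid1" => "status19", "b-timeuuid2" => "status21" }

get_slice(columns, "", 1)  # => { "a-timeuuid1" => "status19" }
```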
20. But...
● Don't use the raw Thrift API!
  ● You won't enjoy it
● Use high-level client APIs
  ● Many options for each language
21. Consistency Levels?
● Eventual consistency
  ● Sync to Washington, async to Hong Kong
● Client API tunables
  ● Synchronously write to W replicas
  ● Confirm R replicas match at read time
  ● ...of N total replicas
● Allows for almost-strong consistency
  ● When W + R > N
22. Write Example
Replication Factor == N == 3: 3 copies
24. Write Example
cl.ONE:
W == 1
Block for success on 1 replica
25. Write Example
cl.QUORUM:
W == N/2+1
Block for success on a majority
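The levels above all trade off against the single rule from slide 21: a read is guaranteed to overlap the latest write when W + R > N, and cl.QUORUM picks a majority on both sides. The arithmetic, as a sketch:

```ruby
# Majority of n replicas, via integer division (the cl.QUORUM size).
def quorum(n)
  n / 2 + 1
end

n = 3
w = quorum(n)        # => 2: block for a majority of writes
r = quorum(n)        # => 2: compare a majority of replicas at read time
w + r > n            # => true: every quorum read overlaps every quorum write

# cl.ONE on both sides gives the lowest latency but no overlap guarantee:
1 + 1 > n            # => false: a ONE read may miss a recent ONE write
```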
26. Caveat consumptor
● No secondary indexes:
  ● Typically implemented in client libraries
● No transactions
  ● But atomic increment/decrement coming real soon now
● Absolutely no joins
  ● You don't really want 'em anyway
29. Cassandra Ruby Support: RDF.rb
● Repository implementation for RDF.rb
● Stores triple of (subject, predicate, object) as (rowkey, name, subname)
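One way to picture that mapping, with plain hashes standing in for the Cassandra-backed repository (the helper here is illustrative, not the RDF.rb API):

```ruby
# Each RDF subject becomes a row key; predicates become column names,
# and objects hang off them — so all triples about one subject live in
# one row and can be fetched with a single row read.
store = Hash.new { |h, k| h[k] = {} }

def insert_triple(store, subject, predicate, object)
  (store[subject][predicate] ||= []) << object
end

insert_triple(store, "http://example.org/stu", "foaf:name", "Stu Hood")
insert_triple(store, "http://example.org/stu", "foaf:knows", "http://example.org/damon")

store["http://example.org/stu"]["foaf:name"]  # => ["Stu Hood"]
```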
30. Silver linings: Ops
● Dead drive?
  ● Swap the drive, restart, run 'repair'
  ● Streams missing data from other replicas
● Dead node?
  ● Start a new node with the same IP and token, run 'repair'
31. Silver linings: Ops
● Need N new nodes?
  ● Start more nodes with the same config file
  ● New nodes request load information from the cluster and join with a token that balances the cluster
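The balancing step can be pictured with a little token arithmetic: RandomPartitioner tokens live on a ring of size 2**127, so n evenly spaced tokens split it into equal arcs (illustrative math, not the node's actual bootstrap logic):

```ruby
# Size of the RandomPartitioner token space (MD5-based).
RING = 2**127

# Evenly spaced tokens: each of the n nodes owns a 1/n arc of the ring.
def balanced_tokens(node_count)
  (0...node_count).map { |i| i * RING / node_count }
end

balanced_tokens(4)
# => [0, 2**125, 2**126, 3 * 2**125]
```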
32. Silver linings: Ops
● Adding a datacenter?
  ● Configure “dc/rack/ip” describing node location
  ● Add new nodes as before