4. Cassandra in eBuddy Messaging Platform
● User Data Service
● User Discovery Service
● Persistent Session Store
● Message History
● Location-based Discovery
5. Some Statistics
● Current size of data
● 1.4 TiB total (replication factor 3); 467 GiB actual data
● 12 million sessions (11 million users plus groups)
● Almost a billion rows in one column family (inverse social graph)
7. Design Objectives
● Data Source Agnostic
● Testable
● Thread Safe
● Strong Typing
● Supports “transactions”, i.e. units of work in batch
● Efficient Mapping to Application Domain Model
● Follows Familiar Patterns (e.g. Spring JDBC Template)
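As an illustration of two of these objectives (units of work in batch, and a template in the style of Spring's JdbcTemplate), here is a minimal Java sketch. The names BatchContext and KeyValueTemplate are hypothetical, and an in-memory map stands in for the data store; this is not the eBuddy API, only one way the pattern could look.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical batch context: collects writes instead of applying them immediately.
final class BatchContext {
    private final List<String[]> pendingWrites = new ArrayList<>();

    void write(String key, String value) {
        pendingWrites.add(new String[] {key, value});
    }

    List<String[]> pending() {
        return pendingWrites;
    }
}

// Hypothetical template: callers pass in the work, the template owns the
// mechanics of executing it (an in-memory map stands in for the data store).
final class KeyValueTemplate {
    private final Map<String, String> store = new ConcurrentHashMap<>();

    // Runs the unit of work, then applies the queued writes together.
    // If the unit of work throws, none of its writes are applied.
    void executeBatch(Consumer<BatchContext> unitOfWork) {
        BatchContext batch = new BatchContext();
        unitOfWork.accept(batch);
        for (String[] write : batch.pending()) {
            store.put(write[0], write[1]);
        }
    }

    String read(String key) {
        return store.get(key);
    }
}
```

Because each call gets its own BatchContext, the template itself holds no per-call state and can be shared between threads. The quotes around "transactions" still apply: this sketch only defers writes until the work completes, it does not provide isolation.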
15. Data Access Object
● The Data Access Object (DAO) is a singleton
● Transforms from data model to domain model
● Operations object is configured with serializers to convert from data model to domain model
● Defines the mappers for read operations
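The roles listed above can be sketched in plain Java. Everything named here (User, Serializer, RowMapper, Operations, UserDao, the display_name column) is hypothetical and chosen for illustration; the store is an in-memory map rather than Cassandra, so this shows the shape of the pattern, not the actual eBuddy or Hector classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Domain model: what the application works with.
final class User {
    final String id;
    final String displayName;

    User(String id, String displayName) {
        this.id = id;
        this.displayName = displayName;
    }
}

// Converts between stored bytes (data model) and typed values.
interface Serializer<V> {
    byte[] toBytes(V value);
    V fromBytes(byte[] bytes);
}

final class StringSerializer implements Serializer<String> {
    public byte[] toBytes(String value) { return value.getBytes(StandardCharsets.UTF_8); }
    public String fromBytes(byte[] bytes) { return new String(bytes, StandardCharsets.UTF_8); }
}

// Mapper for read operations: one stored row in, one domain object out.
interface RowMapper<V, T> {
    T mapRow(String rowKey, Map<String, V> columns);
}

// Operations object, configured with a serializer; rows are kept in memory here.
final class Operations<V> {
    private final Serializer<V> serializer;
    private final Map<String, Map<String, byte[]>> rows = new ConcurrentHashMap<>();

    Operations(Serializer<V> serializer) { this.serializer = serializer; }

    void write(String rowKey, String column, V value) {
        rows.computeIfAbsent(rowKey, k -> new ConcurrentHashMap<>())
            .put(column, serializer.toBytes(value));
    }

    <T> Optional<T> read(String rowKey, RowMapper<V, T> mapper) {
        Map<String, byte[]> raw = rows.get(rowKey);
        if (raw == null) return Optional.empty();
        Map<String, V> columns = new HashMap<>();
        raw.forEach((name, bytes) -> columns.put(name, serializer.fromBytes(bytes)));
        return Optional.of(mapper.mapRow(rowKey, columns));
    }
}

// The DAO keeps no per-request state, so a single shared instance is enough.
final class UserDao {
    private static final RowMapper<String, User> USER_MAPPER =
        (rowKey, columns) -> new User(rowKey, columns.get("display_name"));

    private final Operations<String> operations;

    UserDao(Operations<String> operations) { this.operations = operations; }

    void save(User user) { operations.write(user.id, "display_name", user.displayName); }

    Optional<User> find(String id) { return operations.read(id, USER_MAPPER); }
}
```

In a Spring-style setup the container would typically create the one UserDao instance and inject the Operations object; here the caller constructs both once and reuses them.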
17. CQL3
DataStax:
"We believe that CQL3 is a simpler and overall better API for Cassandra
than the thrift API is. Therefore, new projects/applications are encouraged
to use CQL3"
At eBuddy, we are still using the Thrift API and the Java Hector library.
We are currently evaluating CQL3: whether to use it going forward and
whether to "upgrade" existing code.
18. Structured Data
● Object Mapping Frameworks
● Mapped vs. Embedded Objects
● Nested Properties ("path" access)
19. Object Mapping Frameworks
● Simple mapper frameworks with (some) JPA support
● Hector Object Mapper
● Kundera
● Firebrand (not JPA)
● has most features, e.g. supports both embedded and mapped object graphs
https://github.com/impetus-opensource/Kundera
http://github.com/hector-client/hector
http://firebrandocm.org
20. Hierarchical Properties
● Use DynamicComposites to model keys that have a
variable number of components
put("accounts|msn|x.y.z|sign_in", "0");
put("accounts|msn|x.y.z|key", "value");
get("accounts") --> retrieved as a map:
{"accounts":
{ "msn":
{ "x.y.z":
{ "sign_in": "0",
"key": "value" } } } }
● Use a slice query to retrieve properties using partial path:
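A minimal in-memory sketch of such a partial-path lookup, in Java. It assumes a sorted map of "|"-separated path strings as a stand-in for a row of composite column names; the class HierarchicalProperties and its methods are hypothetical, and the range scan over the sorted map only imitates what a slice query would do against Cassandra.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// In-memory stand-in for one row whose column names are hierarchical paths.
class HierarchicalProperties {
    private static final String SEP = "|";

    // Sorted by path, so all entries sharing a prefix are contiguous.
    private final TreeMap<String, String> columns = new TreeMap<>();

    void put(String path, String value) {
        columns.put(path, value);
    }

    // Returns every property below the given partial path as a nested map.
    Map<String, Object> get(String partialPath) {
        Map<String, Object> root = new TreeMap<>();
        String start = partialPath + SEP;
        // Range scan: begin at the prefix and stop at the first non-matching key.
        for (Map.Entry<String, String> e : columns.tailMap(start, true).entrySet()) {
            if (!e.getKey().startsWith(start)) {
                break;
            }
            insert(root, e.getKey().split(Pattern.quote(SEP)), e.getValue());
        }
        return root;
    }

    // Walks (or creates) one nested map per path component, then sets the leaf.
    // Assumes no path is both a leaf and a parent of other paths.
    @SuppressWarnings("unchecked")
    private static void insert(Map<String, Object> root, String[] parts, String value) {
        Map<String, Object> node = root;
        for (int i = 0; i < parts.length - 1; i++) {
            node = (Map<String, Object>) node.computeIfAbsent(
                parts[i], k -> new TreeMap<String, Object>());
        }
        node.put(parts[parts.length - 1], value);
    }

    public static void main(String[] args) {
        HierarchicalProperties props = new HierarchicalProperties();
        props.put("accounts|msn|x.y.z|sign_in", "0");
        props.put("accounts|msn|x.y.z|key", "value");
        props.put("profile|name", "example");
        // Only the properties below "accounts|msn" are returned.
        System.out.println(props.get("accounts|msn"));
    }
}
```

The sketch relies on sort order: because keys sharing a prefix sit next to each other, a lookup by partial path is a single contiguous range read rather than a scan of all properties.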