Cassandra stores data differently from a traditional RDBMS, and these differences are what allow for improvements in performance, availability and scalability. Aaron Morton, DataStax MVP for Apache Cassandra, will present the basics of the data model and outline the differences clearly. This webinar is 101 level and is suitable for people who are coming from a relational background and are just starting to get into Apache Cassandra.
C*ollege Credit: Data Modeling for Apache Cassandra
1. DATASTAX C*OLLEGE CREDIT: DATA MODELLING FOR APACHE CASSANDRA
Aaron Morton
Apache Cassandra Committer, DataStax MVP for Apache Cassandra
@aaronmorton
www.thelastpickle.com
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
16. Our Keyspace
CREATE KEYSPACE
cass_college
WITH
strategy_class = 'NetworkTopologyStrategy'
AND
strategy_options:datacenter1 = 1;
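The statement above uses the keyspace syntax of an early CQL 3 release. As a sketch of the equivalent for Cassandra 1.2 and later, where replication settings are given as a map (the name datacenter1 is kept from the slide and is assumed to match the data center name your cluster reports):

```cql
-- Same keyspace, written with the replication map syntax.
-- 'datacenter1': 1 means one replica in the data center named datacenter1.
CREATE KEYSPACE cass_college
WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'datacenter1': 1
};
```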
17. Table is
a sparse collection of well-known, ordered columns.
18. First Table
CREATE TABLE User
(
user_name text,
password text,
real_name text,
PRIMARY KEY (user_name)
);
19. Some users...
cqlsh:cass_college> INSERT INTO User
... (user_name, password, real_name)
... VALUES
... ('fred', 'sekr8t', 'Mr Foo');
cqlsh:cass_college> select * from User;
 user_name | password | real_name
-----------+----------+-----------
      fred |   sekr8t |    Mr Foo
20. Some users...
cqlsh:cass_college> INSERT INTO User
... (user_name, password)
... VALUES
... ('bob', 'pwd');
cqlsh:cass_college> select * from User where user_name = 'bob';
 user_name | password | real_name
-----------+----------+-----------
       bob |      pwd |      null
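The null above is the sparseness from slide 17 in practice: bob's row simply has no real_name column stored. As a small sketch, the missing column could be supplied later with an UPDATE. The value 'Mr Bar' is a made-up placeholder and does not come from the talk.

```cql
-- Hypothetical follow-up: set the column that was omitted on insert.
UPDATE User SET real_name = 'Mr Bar' WHERE user_name = 'bob';

-- Reading the row back should now show real_name populated instead of null.
SELECT * FROM User WHERE user_name = 'bob';
```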
27. UserTweets Table...
cqlsh:cass_college> INSERT INTO UserTweets
... (tweet_id, body, user_name, timestamp)
... VALUES
... (1, 'The Tweet','fred',1352150816917);
cqlsh:cass_college> select * from UserTweets where user_name='fred';
 user_name | tweet_id | body      | timestamp
-----------+----------+-----------+--------------------------
      fred |        1 | The Tweet | 2012-11-06 10:26:56+1300
28. UserTweets Table...
cqlsh:cass_college> select * from UserTweets where user_name='fred' and tweet_id=1;
 user_name | tweet_id | body      | timestamp
-----------+----------+-----------+--------------------------
      fred |        1 | The Tweet | 2012-11-06 10:26:56+1300
29. UserTweets Table...
cqlsh:cass_college> INSERT INTO UserTweets
... (tweet_id, body, user_name, timestamp)
... VALUES
... (2, 'Second Tweet', 'fred', 1352150816918);
cqlsh:cass_college> select * from UserTweets where user_name = 'fred';
 user_name | tweet_id | body         | timestamp
-----------+----------+--------------+--------------------------
      fred |        1 |    The Tweet | 2012-11-06 10:26:56+1300
      fred |        2 | Second Tweet | 2012-11-06 10:26:56+1300
30. UserTweets Table...
cqlsh:cass_college> select * from UserTweets where user_name = 'fred' order by tweet_id desc;
 user_name | tweet_id | body         | timestamp
-----------+----------+--------------+--------------------------
      fred |        2 | Second Tweet | 2012-11-06 10:26:56+1300
      fred |        1 |    The Tweet | 2012-11-06 10:26:56+1300
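The slide that defines UserTweets is not part of this transcript. A definition consistent with the queries above might look like the sketch below. The column types are assumptions inferred from the inserted values, and the composite primary key is inferred from the WHERE and ORDER BY clauses that the queries use.

```cql
-- Assumed definition, reconstructed from the INSERT and SELECT statements.
-- user_name is the partition key: all of a user's tweets live in one partition.
-- tweet_id is a clustering column: rows are stored sorted by it, which is what
-- lets "order by tweet_id desc" work within a single user_name.
CREATE TABLE UserTweets (
    user_name text,
    tweet_id  int,        -- assumed type; the slides insert 1 and 2
    body      text,
    timestamp timestamp,  -- the slides insert milliseconds since the epoch
    PRIMARY KEY (user_name, tweet_id)
);
```

With this layout, restricting on user_name alone returns the user's tweets in tweet_id order, and adding tweet_id selects a single row, matching slides 27 to 30.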
32. Data Model (so far)
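CF / Value | User        | Tweet       | User Tweets           | User Timeline
-----------+-------------+-------------+-----------------------+-----------------------
user_name  | Primary Key | Field       | Primary Key           | Primary Key
tweet_id   |             | Primary Key | Primary Key Component | Primary Key Component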
34. UserMetrics Table...
cqlsh:cass_college> UPDATE
... UserMetrics
... SET
... tweets = tweets + 1
... WHERE
... user_name = 'fred';
cqlsh:cass_college> select * from UserMetrics where user_name = 'fred';
 user_name | followers | following | tweets
-----------+-----------+-----------+--------
      fred |      null |      null |      1
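The increment "tweets = tweets + 1" only works on counter columns, so UserMetrics is presumably a counter table. Its definition is not in this transcript. The sketch below is an assumption built from the column names in the output above.

```cql
-- Assumed definition: in a counter table, every column outside the
-- primary key must be of type counter.
CREATE TABLE UserMetrics (
    user_name text,
    followers counter,
    following counter,
    tweets    counter,
    PRIMARY KEY (user_name)
);

-- Counters are never INSERTed; an UPDATE creates the row on first use.
UPDATE UserMetrics SET followers = followers + 1 WHERE user_name = 'fred';
```

Counters that have never been incremented read back as null, which is why followers and following show null for fred above.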
35. Data Model (so far)
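CF / Value | User        | Tweet       | User Tweets           | User Timeline         | User Metrics
-----------+-------------+-------------+-----------------------+-----------------------+--------------
user_name  | Primary Key | Field       | Primary Key           | Primary Key           | Primary Key
tweet_id   |             | Primary Key | Primary Key Component | Primary Key Component |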
37. Relationships
INSERT INTO
Following
(user_name, following, timestamp)
VALUES
('bob', 'fred', 1352247749161);
INSERT INTO
Followers
(user_name, follower, timestamp)
VALUES
('fred', 'bob', 1352247749161);
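The relationship is written twice, once per direction, so the two INSERTs belong together. One option, shown here as a sketch rather than as the method used in the talk, is to send them in a single batch:

```cql
-- Both sides of the relationship submitted as one request.
BEGIN BATCH
    INSERT INTO Following (user_name, following, timestamp)
    VALUES ('bob', 'fred', 1352247749161);

    INSERT INTO Followers (user_name, follower, timestamp)
    VALUES ('fred', 'bob', 1352247749161);
APPLY BATCH;
```

A batch is not an isolated transaction in the relational sense, so a reader may briefly see one row without the other.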
38. Relationships
cqlsh:cass_college> select * from Following;
 user_name | following | timestamp
-----------+-----------+--------------------------
       bob |      fred | 2012-11-07 13:22:29+1300
cqlsh:cass_college> select * from Followers;
 user_name | follower | timestamp
-----------+----------+--------------------------
      fred |      bob | 2012-11-07 13:22:29+1300
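The definitions of Following and Followers are not in this transcript. The sketch below is an assumption based on the columns shown above, with the other user's name as a clustering column so that one user can have many relationships in each direction.

```cql
-- Assumed definitions: one partition per user, one row per relationship.
CREATE TABLE Following (
    user_name text,       -- the user doing the following
    following text,       -- the user being followed
    timestamp timestamp,
    PRIMARY KEY (user_name, following)
);

CREATE TABLE Followers (
    user_name text,       -- the user being followed
    follower  text,       -- the user who follows them
    timestamp timestamp,
    PRIMARY KEY (user_name, follower)
);

-- Each direction of the relationship is then a single-partition read.
SELECT following FROM Following WHERE user_name = 'bob';
SELECT follower  FROM Followers WHERE user_name = 'fred';
```

Storing the relationship in both tables trades extra writes for reads that never need a join or a secondary index.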
39. Data Model
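CF / Value | User        | Tweet       | User Tweets           | User Timeline         | User Metrics | Follows / Followers
-----------+-------------+-------------+-----------------------+-----------------------+--------------+---------------------
user_name  | Primary Key | Field       | Primary Key           | Primary Key           | Primary Key  | Primary Key Field
tweet_id   |             | Primary Key | Primary Key Component | Primary Key Component |              |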