#CASSANDRA13
Patrick McFadin | Solution Architect, DataStax
The World's Next Top Data Model
Monday, June 24, 13
The saga continues!
★ Data model is dead, long live the data model.
★ Bridging from Relational to Cassandra
★ Become a Super Modeler
★ Core data modeling techniques using CQL
Because I love talking about this
Just to recap...
Why does this matter?
* Cassandra lives closer to your users or applications
* Not a hammer for all use case nails
* Proper use case, proper model...
* Get it wrong and...
When to use Cassandra*
* Need to be in more than one datacenter. active-active
* Scaling from 0 to, uh, well... we’re not really sure.
* Need as close to 100% uptime as possible.
* Getting these from any other solution would just be mega $
and...
*nutshell version. These are all ORs not ANDs
You get the data model right!
So let’s do that
* Four real world examples
* Use case, what they were avoiding and model to accomplish
* You may think this is you, but it isn’t. I hear these all the time.
* All examples are in CQL3
But wait you say
CQL doesn’t do dynamic wide rows!
Yes it can!
* CQL does wide rows the same way you did them in Thrift
* No really
* Read this blog post
http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
...or just trust me and I’ll show you how
Customers giving you money is a good reason for uptime
Shopping Cart Data Model
Shopping cart use case
* Store shopping cart data reliably
* Minimize (or eliminate) downtime. Multi-dc
* Scale for the “Cyber Monday” problem
* Every minute off-line is lost $$
* Online shoppers want speed!
The bad
Shopping cart data model
* Each customer can have one or more shopping carts
* De-normalize data for fast access
* Shopping cart == One partition (Row Level Isolation)
* Each item a new column
Shopping cart data model
CREATE TABLE user (
    username varchar,
    firstname varchar,
    lastname varchar,
    shopping_carts set<varchar>,
    PRIMARY KEY (username)
);
CREATE TABLE shopping_cart (
    username varchar,
    cart_name text,
    item_id int,
    item_name varchar,
    description varchar,
    price float,
    item_detail map<varchar,varchar>,
    PRIMARY KEY ((username,cart_name),item_id)
);
INSERT INTO shopping_cart
(username,cart_name,item_id,item_name,description,price,item_detail)
VALUES ('pmcfadin','Gadgets I want',8675309,'Garmin 910XT',
'Multisport training watch',349.99,
{'Related':'Timex sports watch',
'Volume Discount':'10'});
INSERT INTO shopping_cart
(username,cart_name,item_id,item_name,description,price,item_detail)
VALUES ('pmcfadin','Gadgets I want',9748575,'Polaris Foot Pod',
'Bluetooth Smart foot pod',64.00,
{'Related':'Timex foot pod',
'Volume Discount':'25'});
One partition (storage row) of data
Item details. Flexible for whatev
Partition row key for one user's cart
Creates partition row key
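Because the partition key is (username, cart_name), a whole cart lives in one partition and can be read back with a single query. The statements below are a minimal sketch against the two tables on this slide; they are illustrative additions, not part of the original deck, and reuse the sample values from the INSERTs.

```cql
-- Read every item in one cart: a single-partition read.
SELECT item_id, item_name, price
FROM shopping_cart
WHERE username = 'pmcfadin' AND cart_name = 'Gadgets I want';

-- Record the cart name on the user row by adding to the set column.
UPDATE user
SET shopping_carts = shopping_carts + {'Gadgets I want'}
WHERE username = 'pmcfadin';

-- Remove one item from the cart; item_id is the clustering column.
DELETE FROM shopping_cart
WHERE username = 'pmcfadin' AND cart_name = 'Gadgets I want'
AND item_id = 9748575;
```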
Watching users, making decisions. Freaky, but cool.
User Activity Tracking
User activity use case
* React to user input in real time
* Support for multiple application pods
* Scale for speed
* Losing interactions is costly
* Waiting for batch (Hadoop) is too long
The bad
User activity data model
* Interaction points stored per user in short table
* Long term interaction stored in similar table with date partition
* Process long term later using batch
* Reverse time series to get last N items
User activity data model
CREATE TABLE user_activity (
    username varchar,
    interaction_time timeuuid,
    activity_code varchar,
    detail varchar,
    PRIMARY KEY (username, interaction_time)
) WITH CLUSTERING ORDER BY (interaction_time DESC);
CREATE TABLE user_activity_history (
    username varchar,
    interaction_date varchar,
    interaction_time timeuuid,
    activity_code varchar,
    detail varchar,
    PRIMARY KEY ((username,interaction_date),interaction_time)
);
INSERT INTO user_activity
(username,interaction_time,activity_code,detail)
VALUES ('pmcfadin',0D1454E0-D202-11E2-8B8B-0800200C9A66,'100','Normal login')
USING TTL 2592000;
INSERT INTO user_activity_history
(username,interaction_date,interaction_time,activity_code,detail)
VALUES ('pmcfadin','20130605',0D1454E0-D202-11E2-8B8B-0800200C9A66,'100','Normal login');
Reverse order based on timestamp
Expire after 30 days
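The TTL value of 2592000 is 30 days in seconds (30 x 24 x 60 x 60). As a rough sketch of how the two tables might be read, assuming the schema and sample rows on this slide (these queries are illustrative additions, not from the original deck):

```cql
-- Last 5 interactions for one user; newest first because of
-- CLUSTERING ORDER BY (interaction_time DESC).
SELECT interaction_time, activity_code, detail
FROM user_activity
WHERE username = 'pmcfadin'
LIMIT 5;

-- Seconds remaining before the value expires (TTL is read per column).
SELECT detail, TTL(detail)
FROM user_activity
WHERE username = 'pmcfadin'
LIMIT 1;

-- One day of long-term history, e.g. as input for later batch processing.
SELECT interaction_time, activity_code, detail
FROM user_activity_history
WHERE username = 'pmcfadin' AND interaction_date = '20130605';
```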
Data model usage
 username | interaction_time                     | detail                                   | activity_code
----------+--------------------------------------+------------------------------------------+---------------
 pmcfadin | 9ccc9df0-d076-11e2-923e-5d8390e664ec | Entered shopping area: Jewelry           | 301
 pmcfadin | 9c652990-d076-11e2-923e-5d8390e664ec | Created shopping cart: Anniversary gifts | 202
 pmcfadin | 1b5cef90-d076-11e2-923e-5d8390e664ec | Deleted shopping cart: Gadgets I want    | 205
 pmcfadin | 1b0e5a60-d076-11e2-923e-5d8390e664ec | Opened shopping cart: Gadgets I want     | 201
 pmcfadin | 1b0be960-d076-11e2-923e-5d8390e664ec | Normal login                             | 100
select * from user_activity limit 5;
Maybe put a sale item for flowers too?
Machines generate logs at a furious pace. Be ready.
Log collection/aggregation
Log collection use case
* Collect log data at high speed
* Cassandra near where logs are generated. Multi-datacenter
* Dice data for various uses. Dashboard. Lookup. Etc.
* The scale needed for RDBMS is cost prohibitive
* Batch analysis of logs too late for some use cases
The bad
Log collection data model
* Use Flume to collect and fan out data to various tables
* Tables for lookup based on source and time
* Tables for dashboard with aggregation and summation
Log collection data model
CREATE TABLE log_lookup (
    source varchar,
    date_to_minute varchar,
    timestamp timeuuid,
    raw_log blob,
    PRIMARY KEY ((source,date_to_minute),timestamp)
);
CREATE TABLE login_success (
    source varchar,
    date_to_minute varchar,
    successful_logins counter,
    PRIMARY KEY (source,date_to_minute)
) WITH CLUSTERING ORDER BY (date_to_minute DESC);
CREATE TABLE login_failure (
    source varchar,
    date_to_minute varchar,
    failed_logins counter,
    PRIMARY KEY (source,date_to_minute)
) WITH CLUSTERING ORDER BY (date_to_minute DESC);
Consider storing raw logs as GZIP
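A sketch of the writes these tables imply, under some assumptions that are not in the slides: the source name 'web01', the log text, and a yyyyMMddHHmm format for date_to_minute are all hypothetical. It also assumes a CQL3 version that provides the now() and textAsBlob() functions.

```cql
-- Store one raw log line; now() generates the timeuuid on the server.
INSERT INTO log_lookup (source, date_to_minute, timestamp, raw_log)
VALUES ('web01', '201306241001', now(),
        textAsBlob('sshd: Accepted password for pmcfadin'));

-- Counter columns are changed with UPDATE, not INSERT.
UPDATE login_success
SET successful_logins = successful_logins + 1
WHERE source = 'web01' AND date_to_minute = '201306241001';

UPDATE login_failure
SET failed_logins = failed_logins + 1
WHERE source = 'web01' AND date_to_minute = '201306241001';
```

Bucketing the lookup table by (source, date_to_minute) keeps each partition to one minute of logs from one source, while the counter tables keep one partition per source with a row per minute.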
Log dashboard
[Chart: Successful Logins and Failed Logins per minute, 10:01 to 10:19, vertical axis 0 to 100]
SELECT date_to_minute,successful_logins
FROM login_success
LIMIT 20;
SELECT date_to_minute,failed_logins
FROM login_failure
LIMIT 20;
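The descending clustering order applies within a partition, so a per-source dashboard would probably restrict each query by source. A hedged variant, where the source name 'web01' and the yyyyMMddHHmm values are hypothetical and not from the slides:

```cql
-- Most recent 20 minutes of successful logins for one source.
SELECT date_to_minute, successful_logins
FROM login_success
WHERE source = 'web01'
LIMIT 20;

-- Failed logins for an explicit time window on the same source.
SELECT date_to_minute, failed_logins
FROM login_failure
WHERE source = 'web01'
AND date_to_minute >= '201306241001'
AND date_to_minute <= '201306241019';
```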
Because mistaks mistakes happen
User Form Versioning
Form versioning use case
* Store every possible version efficiently
* Scale to any number of users
* Commit/Rollback functionality on a form
* In RDBMS, many relations that need complicated join
* Needs to be in cloud and local data center
The bad
Form version data model
* Each user has a form
* Each form needs versioning
* Separate table to store live version
* Exclusive lock on a form
Form version data model
CREATE TABLE working_version (
    username varchar,
    form_id int,
    version_number int,
    locked_by varchar,
    form_attributes map<varchar,varchar>,
    PRIMARY KEY ((username, form_id), version_number)
) WITH CLUSTERING ORDER BY (version_number DESC);
INSERT INTO working_version
(username, form_id, version_number, locked_by, form_attributes)
VALUES ('pmcfadin',1138,1,'',
{'FirstName<text>':'First Name: ',
'LastName<text>':'Last Name: ',
'EmailAddress<text>':'Email Address: ',
'Newsletter<radio>':'Y,N'});
UPDATE working_version
SET locked_by = 'pmcfadin'
WHERE username = 'pmcfadin'
AND form_id = 1138
AND version_number = 1;
INSERT INTO working_version
(username, form_id, version_number, locked_by, form_attributes)
VALUES ('pmcfadin',1138,2,null,
{'FirstName<text>':'First Name: ',
'LastName<text>':'Last Name: ',
'EmailAddress<text>':'Email Address: ',
'Newsletter<checkbox>':'Y'});
1. Insert first version
2. Lock for one user
3. Insert new version. Release lock
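Reading the versions back might look like the sketch below, which assumes the working_version table and the two sample versions from this slide; the queries are illustrative additions, not from the original deck. Note that the UPDATE in step 2 is an ordinary write, so on its own it does not check whether another user already holds the lock.

```cql
-- Latest version of a form: versions cluster in descending order,
-- so the first row of the partition is the newest.
SELECT version_number, locked_by, form_attributes
FROM working_version
WHERE username = 'pmcfadin' AND form_id = 1138
LIMIT 1;

-- Rollback: read an earlier version by its number.
SELECT form_attributes
FROM working_version
WHERE username = 'pmcfadin' AND form_id = 1138
AND version_number = 1;
```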
That’s it!
“Mind what you have learned. Save you it can.”
- Yoda. Master Data Modeler
Your data model is next!
* Try out a few things
* See what works
* If all else fails, engage an expert in the community
* Want more? Follow me on Twitter: @PatrickMcFadin
Monday, June 24, 13

Mais conteúdo relacionado

Mais procurados

Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDBMongoDB
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valleyPatrick McFadin
 
Proxysql sharding
Proxysql shardingProxysql sharding
Proxysql shardingMarco Tusa
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...Altinity Ltd
 
Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performanceDaum DNA
 
How Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferHow Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferScyllaDB
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperSaurav Haloi
 
Oracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesOracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesBobby Curtis
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...Altinity Ltd
 
Understanding PostgreSQL LW Locks
Understanding PostgreSQL LW LocksUnderstanding PostgreSQL LW Locks
Understanding PostgreSQL LW LocksJignesh Shah
 
Getting the Scylla Shard-Aware Drivers Faster
Getting the Scylla Shard-Aware Drivers FasterGetting the Scylla Shard-Aware Drivers Faster
Getting the Scylla Shard-Aware Drivers FasterScyllaDB
 
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11Kenny Gryp
 
Redefining tables online without surprises
Redefining tables online without surprisesRedefining tables online without surprises
Redefining tables online without surprisesNelson Calero
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZENorvald Ryeng
 
MySQL Performance for DevOps
MySQL Performance for DevOpsMySQL Performance for DevOps
MySQL Performance for DevOpsSveta Smirnova
 
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...DataStax
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundMasahiko Sawada
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOAltinity Ltd
 

Mais procurados (20)

Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
Become a super modeler
Become a super modelerBecome a super modeler
Become a super modeler
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valley
 
Proxysql sharding
Proxysql shardingProxysql sharding
Proxysql sharding
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performance
 
How Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferHow Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and Safer
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Oracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesOracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best Practices
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
 
Understanding PostgreSQL LW Locks
Understanding PostgreSQL LW LocksUnderstanding PostgreSQL LW Locks
Understanding PostgreSQL LW Locks
 
Getting the Scylla Shard-Aware Drivers Faster
Getting the Scylla Shard-Aware Drivers FasterGetting the Scylla Shard-Aware Drivers Faster
Getting the Scylla Shard-Aware Drivers Faster
 
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
 
Redefining tables online without surprises
Redefining tables online without surprisesRedefining tables online without surprises
Redefining tables online without surprises
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZE
 
MySQL Performance for DevOps
MySQL Performance for DevOpsMySQL Performance for DevOps
MySQL Performance for DevOps
 
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
 

Semelhante a Cassandra13: Data Modeling Techniques Using CQL

Cassandra Community Webinar | The World's Next Top Data Model
Cassandra Community Webinar | The World's Next Top Data ModelCassandra Community Webinar | The World's Next Top Data Model
Cassandra Community Webinar | The World's Next Top Data ModelDataStax
 
Time series with apache cassandra strata
Time series with apache cassandra   strataTime series with apache cassandra   strata
Time series with apache cassandra strataPatrick McFadin
 
Suicide Risk Prediction Using Social Media and Cassandra
Suicide Risk Prediction Using Social Media and CassandraSuicide Risk Prediction Using Social Media and Cassandra
Suicide Risk Prediction Using Social Media and CassandraKen Krugler
 
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable CassandraCassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandraaaronmorton
 
The Last Pickle: Repeatable, Scalable, Reliable, Observable: Cassandra
The Last Pickle: Repeatable, Scalable, Reliable, Observable: CassandraThe Last Pickle: Repeatable, Scalable, Reliable, Observable: Cassandra
The Last Pickle: Repeatable, Scalable, Reliable, Observable: CassandraDataStax Academy
 
Cassandra Data Modeling
Cassandra Data ModelingCassandra Data Modeling
Cassandra Data ModelingBen Knear
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionPatrick McFadin
 
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...it-people
 
Meetup cassandra for_java_cql
Meetup cassandra for_java_cqlMeetup cassandra for_java_cql
Meetup cassandra for_java_cqlzznate
 
DNN Database Tips & Tricks
DNN Database Tips & TricksDNN Database Tips & Tricks
DNN Database Tips & TricksWill Strohl
 
C* Summit 2013: Optimizing the Public Cloud for Cost and Scalability with Cas...
C* Summit 2013: Optimizing the Public Cloud for Cost and Scalability with Cas...C* Summit 2013: Optimizing the Public Cloud for Cost and Scalability with Cas...
C* Summit 2013: Optimizing the Public Cloud for Cost and Scalability with Cas...DataStax Academy
 
OTN tour 2015 AWR data mining
OTN tour 2015 AWR data miningOTN tour 2015 AWR data mining
OTN tour 2015 AWR data miningAndrejs Vorobjovs
 
Advanced MariaDB features that developers love.pdf
Advanced MariaDB features that developers love.pdfAdvanced MariaDB features that developers love.pdf
Advanced MariaDB features that developers love.pdfFederico Razzoli
 
AWR DB performance Data Mining - Collaborate 2015
AWR DB performance Data Mining - Collaborate 2015AWR DB performance Data Mining - Collaborate 2015
AWR DB performance Data Mining - Collaborate 2015Yury Velikanov
 
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?Martin Loetzsch
 
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...Codemotion
 
C* Summit 2013: Suicide Risk Prediction Using Social Media and Cassandra by K...
C* Summit 2013: Suicide Risk Prediction Using Social Media and Cassandra by K...C* Summit 2013: Suicide Risk Prediction Using Social Media and Cassandra by K...
C* Summit 2013: Suicide Risk Prediction Using Social Media and Cassandra by K...DataStax Academy
 

Semelhante a Cassandra13: Data Modeling Techniques Using CQL (20)

Cassandra Community Webinar | The World's Next Top Data Model
Cassandra Community Webinar | The World's Next Top Data ModelCassandra Community Webinar | The World's Next Top Data Model
Cassandra Community Webinar | The World's Next Top Data Model
 
Time series with apache cassandra strata
Time series with apache cassandra   strataTime series with apache cassandra   strata
Time series with apache cassandra strata
 
Suicide Risk Prediction Using Social Media and Cassandra
Suicide Risk Prediction Using Social Media and CassandraSuicide Risk Prediction Using Social Media and Cassandra
Suicide Risk Prediction Using Social Media and Cassandra
 
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable CassandraCassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
 
The Last Pickle: Repeatable, Scalable, Reliable, Observable: Cassandra
The Last Pickle: Repeatable, Scalable, Reliable, Observable: CassandraThe Last Pickle: Repeatable, Scalable, Reliable, Observable: Cassandra
The Last Pickle: Repeatable, Scalable, Reliable, Observable: Cassandra
 
Cassandra Data Modeling
Cassandra Data ModelingCassandra Data Modeling
Cassandra Data Modeling
 
ໂປຮແກຮມ MySQL
ໂປຮແກຮມ MySQLໂປຮແກຮມ MySQL
ໂປຮແກຮມ MySQL
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long version
 
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
 
Meetup cassandra for_java_cql
Meetup cassandra for_java_cqlMeetup cassandra for_java_cql
Meetup cassandra for_java_cql
 
DNN Database Tips & Tricks
DNN Database Tips & TricksDNN Database Tips & Tricks
DNN Database Tips & Tricks
 
C* Summit 2013: Optimizing the Public Cloud for Cost and Scalability with Cas...
C* Summit 2013: Optimizing the Public Cloud for Cost and Scalability with Cas...C* Summit 2013: Optimizing the Public Cloud for Cost and Scalability with Cas...
C* Summit 2013: Optimizing the Public Cloud for Cost and Scalability with Cas...
 
OTN tour 2015 AWR data mining
OTN tour 2015 AWR data miningOTN tour 2015 AWR data mining
OTN tour 2015 AWR data mining
 
Optimizer Statistics
Optimizer StatisticsOptimizer Statistics
Optimizer Statistics
 
Advanced MariaDB features that developers love.pdf
Advanced MariaDB features that developers love.pdfAdvanced MariaDB features that developers love.pdf
Advanced MariaDB features that developers love.pdf
 
AWR DB performance Data Mining - Collaborate 2015
AWR DB performance Data Mining - Collaborate 2015AWR DB performance Data Mining - Collaborate 2015
AWR DB performance Data Mining - Collaborate 2015
 
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
 
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
 
Relationships are hard
Relationships are hardRelationships are hard
Relationships are hard
 
C* Summit 2013: Suicide Risk Prediction Using Social Media and Cassandra by K...
C* Summit 2013: Suicide Risk Prediction Using Social Media and Cassandra by K...C* Summit 2013: Suicide Risk Prediction Using Social Media and Cassandra by K...
C* Summit 2013: Suicide Risk Prediction Using Social Media and Cassandra by K...
 

Mais de Patrick McFadin

Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast DataPatrick McFadin
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!Patrick McFadin
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team ApachePatrick McFadin
 
Laying down the smack on your data pipelines
Laying down the smack on your data pipelinesLaying down the smack on your data pipelines
Laying down the smack on your data pipelinesPatrick McFadin
 
Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Patrick McFadin
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraPatrick McFadin
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache CassandraPatrick McFadin
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterprisePatrick McFadin
 
Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced previewPatrick McFadin
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraPatrick McFadin
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the firePatrick McFadin
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015Patrick McFadin
 
Nike Tech Talk: Double Down on Apache Cassandra and Spark
Nike Tech Talk:  Double Down on Apache Cassandra and SparkNike Tech Talk:  Double Down on Apache Cassandra and Spark
Nike Tech Talk: Double Down on Apache Cassandra and SparkPatrick McFadin
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataPatrick McFadin
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014Patrick McFadin
 
Making money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guideMaking money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guidePatrick McFadin
 
Cassandra EU - Data model on fire
Cassandra EU - Data model on fireCassandra EU - Data model on fire
Cassandra EU - Data model on firePatrick McFadin
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesPatrick McFadin
 
Cassandra 2.0 better, faster, stronger
Cassandra 2.0   better, faster, strongerCassandra 2.0   better, faster, stronger
Cassandra 2.0 better, faster, strongerPatrick McFadin
 

Mais de Patrick McFadin (20)

Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast Data
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
 
Laying down the smack on your data pipelines
Laying down the smack on your data pipelinesLaying down the smack on your data pipelines
Laying down the smack on your data pipelines
 
Help! I want to contribute to an Open Source project but my boss says no.
Cassandra13: Data Modeling Techniques Using CQL

  • 1. #CASSANDRA13 Patrick McFadin | Solution Architect, DataStax The World's Next Top Data Model Monday, June 24, 13
  • 2. The saga continues!
       ★ Data model is dead, long live the data model.
       ★ Bridging from Relational to Cassandra
       ★ Become a Super Modeler
       ★ Core data modeling techniques using CQL
  • 3. Because I love talking about this
       Just to recap...
  • 4. Why does this matter?
       * Cassandra lives closer to your users or applications
       * Not a hammer for all use case nails
       * Proper use case, proper model...
       * Get it wrong and...
  • 5. When to use Cassandra*
       * Need to be in more than one datacenter. active-active
       * Scaling from 0 to, uh, well... we’re not really sure.
       * Need as close to 100% uptime as possible.
       * Getting these from any other solution would just be mega $
       and...
       *nutshell version. These are all ORs not ANDs
  • 6. You get the data model right!
  • 7. So let’s do that
       * Four real world examples
       * Use case, what they were avoiding and model to accomplish
       * You may think this is you, but it isn’t. I hear these all the time.
       * All examples are in CQL3
  • 8. But wait you say
       CQL doesn’t do dynamic wide rows!
  • 9. Yes it can!
       * CQL does wide rows the same way you did them in Thrift
       * No really
       * Read this blog post
         http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
       ...or just trust me and I’ll show you how
  • 10. Customers giving you money is a good reason for uptime
       Shopping Cart Data Model
  • 11. Shopping cart use case
       * Store shopping cart data reliably
       * Minimize (or eliminate) downtime. Multi-dc
       * Scale for the “Cyber Monday” problem
       * Every minute off-line is lost $$
       * Online shoppers want speed!
       The bad
  • 12. Shopping cart data model
       * Each customer can have one or more shopping carts
       * De-normalize data for fast access
       * Shopping cart == One partition (Row Level Isolation)
       * Each item a new column
  • 13. Shopping cart data model

       CREATE TABLE user (
           username varchar,
           firstname varchar,
           lastname varchar,
           shopping_carts set<varchar>,
           PRIMARY KEY (username)
       );

       CREATE TABLE shopping_cart (
           username varchar,
           cart_name text,
           item_id int,
           item_name varchar,
           description varchar,
           price float,
           -- Item details. Flexible for whatev
           item_detail map<varchar,varchar>,
           -- Creates partition row key
           PRIMARY KEY ((username,cart_name),item_id)
       );

       -- One partition (storage row) of data.
       -- ('pmcfadin','Gadgets I want') is the partition row key for one user's cart.
       INSERT INTO shopping_cart (username,cart_name,item_id,item_name,description,price,item_detail)
       VALUES ('pmcfadin','Gadgets I want',8675309,'Garmin 910XT','Multisport training watch',349.99,
               {'Related':'Timex sports watch', 'Volume Discount':'10'});

       INSERT INTO shopping_cart (username,cart_name,item_id,item_name,description,price,item_detail)
       VALUES ('pmcfadin','Gadgets I want',9748575,'Polaris Foot Pod','Bluetooth Smart foot pod',64.00,
               {'Related':'Timex foot pod', 'Volume Discount':'25'});
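Because the cart is a single partition, reading it back needs only the two partition key columns. The following is a minimal sketch, assuming the `user` and `shopping_cart` tables as defined on slide 13; the choice of selected columns and the set update are illustrations, not statements taken from the talk.

```cql
-- Read every item in one cart. Both partition key columns are supplied,
-- so the query is served from a single partition.
SELECT item_id, item_name, price
FROM shopping_cart
WHERE username = 'pmcfadin' AND cart_name = 'Gadgets I want';

-- Record the cart name in the user's set of carts (set append).
UPDATE user
SET shopping_carts = shopping_carts + {'Gadgets I want'}
WHERE username = 'pmcfadin';
```

Keeping the cart names in the `shopping_carts` set gives the application a way to list a user's carts without scanning `shopping_cart`.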
  • 14. Watching users, making decisions. Freaky, but cool.
       User Activity Tracking
  • 15. User activity use case
       * React to user input in real time
       * Support for multiple application pods
       * Scale for speed
       * Losing interactions is costly
       * Waiting for batch (Hadoop) is too long
       The bad
  • 16. User activity data model
       * Interaction points stored per user in short table
       * Long term interaction stored in similar table with date partition
       * Process long term later using batch
       * Reverse time series to get last N items
  • 17. User activity data model

       CREATE TABLE user_activity (
           username varchar,
           interaction_time timeuuid,
           activity_code varchar,
           detail varchar,
           PRIMARY KEY (username, interaction_time)
       ) WITH CLUSTERING ORDER BY (interaction_time DESC);  -- Reverse order based on timestamp

       CREATE TABLE user_activity_history (
           username varchar,
           interaction_date varchar,
           interaction_time timeuuid,
           activity_code varchar,
           detail varchar,
           PRIMARY KEY ((username,interaction_date),interaction_time)
       );

       INSERT INTO user_activity (username,interaction_time,activity_code,detail)
       VALUES ('pmcfadin',0D1454E0-D202-11E2-8B8B-0800200C9A66,'100','Normal login')
       USING TTL 2592000;  -- Expire after 30 days

       INSERT INTO user_activity_history (username,interaction_date,interaction_time,activity_code,detail)
       VALUES ('pmcfadin','20130605',0D1454E0-D202-11E2-8B8B-0800200C9A66,'100','Normal login');
  • 18. Data model usage

       select * from user_activity limit 5;

        username | interaction_time                     | detail                                   | activity_code
       ----------+--------------------------------------+------------------------------------------+---------------
        pmcfadin | 9ccc9df0-d076-11e2-923e-5d8390e664ec | Entered shopping area: Jewelry           | 301
        pmcfadin | 9c652990-d076-11e2-923e-5d8390e664ec | Created shopping cart: Anniversary gifts | 202
        pmcfadin | 1b5cef90-d076-11e2-923e-5d8390e664ec | Deleted shopping cart: Gadgets I want    | 205
        pmcfadin | 1b0e5a60-d076-11e2-923e-5d8390e664ec | Opened shopping cart: Gadgets I want     | 201
        pmcfadin | 1b0be960-d076-11e2-923e-5d8390e664ec | Normal login                             | 100

       Maybe put a sale item for flowers too?
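The query on slide 18 has no WHERE clause, so it reads rows in partition (token) order rather than for a chosen user. A hedged sketch of the per-user form follows, assuming the two tables from slide 17; the column lists are illustrative choices and not part of the original slides.

```cql
-- Last 5 interactions for one user. Rows come back newest first
-- because the table uses CLUSTERING ORDER BY (interaction_time DESC).
SELECT interaction_time, activity_code, detail
FROM user_activity
WHERE username = 'pmcfadin'
LIMIT 5;

-- One day of long-term history for the same user. The date bucket is part
-- of the partition key, so both key columns must be supplied.
SELECT interaction_time, activity_code, detail
FROM user_activity_history
WHERE username = 'pmcfadin' AND interaction_date = '20130605';
```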
  • 19. Machines generate logs at a furious pace. Be ready.
       Log collection/aggregation
  • 20. Log collection use case
       * Collect log data at high speed
       * Cassandra near where logs are generated. Multi-datacenter
       * Dice data for various uses. Dashboard. Lookup. Etc.
       * The scale needed for RDBMS is cost prohibitive
       * Batch analysis of logs too late for some use cases
       The bad
  • 21. Log collection data model
       * Use Flume to collect and fan out data to various tables
       * Tables for lookup based on source and time
       * Tables for dashboard with aggregation and summation
  • 22. Log collection data model

       CREATE TABLE log_lookup (
           source varchar,
           date_to_minute varchar,
           timestamp timeuuid,
           raw_log blob,  -- Consider storing raw logs as GZIP
           PRIMARY KEY ((source,date_to_minute),timestamp)
       );

       CREATE TABLE login_success (
           source varchar,
           date_to_minute varchar,
           successful_logins counter,
           PRIMARY KEY (source,date_to_minute)
       ) WITH CLUSTERING ORDER BY (date_to_minute DESC);

       CREATE TABLE login_failure (
           source varchar,
           date_to_minute varchar,
           failed_logins counter,
           PRIMARY KEY (source,date_to_minute)
       ) WITH CLUSTERING ORDER BY (date_to_minute DESC);
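The slides define the counter tables but do not show how they are written. Counter columns are changed with UPDATE rather than INSERT, so a collector might issue something like the sketch below. The source name `'web01'` and the `yyyymmddHHMM` bucket format are assumptions made for illustration; the talk does not specify either.

```cql
-- Count one successful login for a source in its minute bucket.
-- 'web01' and '201306241001' are hypothetical values.
UPDATE login_success
SET successful_logins = successful_logins + 1
WHERE source = 'web01' AND date_to_minute = '201306241001';

-- Same pattern for a failed login.
UPDATE login_failure
SET failed_logins = failed_logins + 1
WHERE source = 'web01' AND date_to_minute = '201306241001';

-- Dashboard read for one source: the most recent 20 minute buckets,
-- newest first because of CLUSTERING ORDER BY (date_to_minute DESC).
SELECT date_to_minute, successful_logins
FROM login_success
WHERE source = 'web01'
LIMIT 20;
```

The descending order only sorts correctly if the bucket string sorts lexically in time order, which is why a fixed-width format is assumed here.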
  • 23. Log dashboard

       [Chart: Successful Logins and Failed Logins per minute, 10:01 to 10:19, y-axis 0 to 100]

       SELECT date_to_minute,successful_logins FROM login_success LIMIT 20;
       SELECT date_to_minute,failed_logins FROM login_failure LIMIT 20;
  • 24. Because mistaks mistakes happen
       User Form Versioning
  • 25. Form versioning use case
       * Store every possible version efficiently
       * Scale to any number of users
       * Commit/Rollback functionality on a form
       * In RDBMS, many relations that need complicated join
       * Needs to be in cloud and local data center
       The bad
  • 26. Form version data model
       * Each user has a form
       * Each form needs versioning
       * Separate table to store live version
       * Exclusive lock on a form
  • 27. Form version data model

       CREATE TABLE working_version (
           username varchar,
           form_id int,
           version_number int,
           locked_by varchar,
           form_attributes map<varchar,varchar>,
           PRIMARY KEY ((username, form_id), version_number)
       ) WITH CLUSTERING ORDER BY (version_number DESC);

       -- 1. Insert first version
       INSERT INTO working_version (username, form_id, version_number, locked_by, form_attributes)
       VALUES ('pmcfadin',1138,1,'',
               {'FirstName<text>':'First Name: ',
                'LastName<text>':'Last Name: ',
                'EmailAddress<text>':'Email Address: ',
                'Newsletter<radio>':'Y,N'});

       -- 2. Lock for one user
       UPDATE working_version SET locked_by = 'pmcfadin'
       WHERE username = 'pmcfadin' AND form_id = 1138 AND version_number = 1;

       -- 3. Insert new version. Release lock
       INSERT INTO working_version (username, form_id, version_number, locked_by, form_attributes)
       VALUES ('pmcfadin',1138,2,null,
               {'FirstName<text>':'First Name: ',
                'LastName<text>':'Last Name: ',
                'EmailAddress<text>':'Email Address: ',
                'Newsletter<checkbox>':'Y'});
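With versions clustered in descending order, the newest version of a form is the first row of its partition. The sketch below assumes the `working_version` table from slide 27. The second statement is not from the talk: it uses a conditional update (`IF`), which requires Cassandra 2.0 or later, and is shown only as one possible way to make the lock in step 2 a compare-and-set.

```cql
-- Fetch the latest version of one form. LIMIT 1 returns the highest
-- version_number because of CLUSTERING ORDER BY (version_number DESC).
SELECT version_number, locked_by, form_attributes
FROM working_version
WHERE username = 'pmcfadin' AND form_id = 1138
LIMIT 1;

-- Assumption: Cassandra 2.0+ lightweight transaction, not part of the slides.
-- Takes the lock only if nobody holds it (locked_by is still '').
UPDATE working_version
SET locked_by = 'pmcfadin'
WHERE username = 'pmcfadin' AND form_id = 1138 AND version_number = 1
IF locked_by = '';
```

The plain UPDATE on the slide overwrites whatever value is in `locked_by`, so two users locking at the same time would both appear to succeed; the conditional form reports whether the write was applied.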
  • 28. That’s it!
       “Mind what you have learned. Save you it can.”
       - Yoda. Master Data Modeler
  • 29. Your data model is next!
       * Try out a few things
       * See what works
       * All else fails, engage an expert in the community
       * Want more? Follow me on twitter: @PatrickMcFadin