Cassandra nice use-cases and worst anti-patterns 
DuyHai DOAN, Technical Advocate 
@doanduyhai
Agenda! 
Anti-patterns 
• Queue-like designs 
• CQL null values 
• Intensive update on same column 
• Design around dynamic schema
Nice use-cases 
• Rate-limiting 
• Anti Fraud 
• Account validation 
• Sensor data timeseries
Worst anti-patterns! 
Queue-like designs! 
CQL null! 
Intensive update on same column! 
Design around dynamic schema! 
Failure level! 
☠ 
☠☠ 
☠☠☠ 
☠☠☠☠
Queue-like designs! 
Adding new message ☞ 1 physical insert 
Consuming message = deleting it ☞ 1 physical insert (tombstone) 
Transactional queue = re-inserting messages ☞ physical insert * <many>
FIFO queue
Physical cells written (a repeated letter is the tombstone of that message) ☞ live queue content
A ☞ { A }
A B ☞ { A, B }
A B C ☞ { A, B, C }
A B C A ☞ { B, C }
A B C A D ☞ { B, C, D }
A B C A D B ☞ { C, D }
A B C A D B C ☞ { D }
FIFO queue, worst case
A A A A A A A A A A ☞ { }
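The FIFO walk-through above can be reproduced with a small in-memory sketch. It is not Cassandra itself: `QueuePartition`, `enqueue`, `consume` and `live_messages` are hypothetical names, and the partition is modelled as an append-only list of cells, ignoring compaction and tombstone garbage collection.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class QueuePartition:
    """Toy model of one queue partition stored as append-only cells."""
    # Each cell is (message, is_tombstone); nothing is ever removed in place.
    cells: List[Tuple[str, bool]] = field(default_factory=list)

    def enqueue(self, message: str) -> None:
        # Adding a message is one physical insert.
        self.cells.append((message, False))

    def consume(self, message: str) -> None:
        # Consuming a message is also one physical insert: a tombstone.
        self.cells.append((message, True))

    def live_messages(self) -> List[str]:
        # A read has to walk every cell, tombstones included,
        # to work out which messages are still alive.
        live = {}
        for message, is_tombstone in self.cells:
            if is_tombstone:
                live.pop(message, None)
            else:
                live[message] = True
        return list(live)


queue = QueuePartition()
for name in ("A", "B", "C"):
    queue.enqueue(name)
queue.consume("A")
queue.enqueue("D")
queue.consume("B")
queue.consume("C")
print(queue.live_messages(), len(queue.cells))
```

In this model the last state of the walk-through holds a single live message but seven physical cells, and a queue that is filled and drained repeatedly ends up with only dead cells to scan.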
Failure level! 
☠☠☠
CQL null semantics! 
Reading null value means
• value does not exist (has never been created)
• value deleted (tombstone)
SELECT age FROM users WHERE login = 'ddoan'; → NULL
Writing null means
• delete value (creating tombstone)
• even though it does not exist
UPDATE users SET age = NULL WHERE login = 'ddoan';
Seen in production: prepared statement 
UPDATE users SET 
age = ?, 
… 
geo_location = ?, 
mood = ?, 
… 
WHERE login = ?;
Seen in production: bound statement
preparedStatement.bind(33, …, null, null, null, …);
null ☞ tombstone creation on each update …
Resulting row (✗ = tombstone):
login | age | name     | geo_loc | mood | status
jdoe  | 33  | John DOE | ✗       | ✗    | ✗
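One possible way to avoid binding nulls is to build the statement from the columns that actually have a value. The sketch below only assembles the CQL text and its parameter list; `build_update` is a hypothetical helper, not a driver API, and it assumes the table and column names come from trusted application code rather than user input.

```python
from typing import Any, Dict, List, Tuple


def build_update(table: str, key_column: str, key_value: Any,
                 values: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build an UPDATE that only sets columns whose value is not None."""
    present = {column: value for column, value in values.items()
               if value is not None}
    if not present:
        raise ValueError("nothing to update")
    assignments = ", ".join(f"{column} = ?" for column in present)
    query = f"UPDATE {table} SET {assignments} WHERE {key_column} = ?;"
    # Parameters follow the order of the placeholders, key last.
    return query, list(present.values()) + [key_value]


query, params = build_update(
    "users", "login", "jdoe",
    {"age": 33, "geo_location": None, "mood": None},
)
print(query)
print(params)
```

The trade-off is that each distinct set of columns produces a distinct statement to prepare, so this suits tables with a small number of update shapes.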
Failure level! 
☠
Intensive update! 
Context 
• small start-up 
• cloud-based video recording & alarm 
• internet of things (sensor) 
• 10 updates/sec for some sensors
Intensive update on same column! 
Data model
CREATE TABLE sensor_data (
    sensor_id bigint,
    value double,
    PRIMARY KEY(sensor_id));
sensor_id ☞ value = 45.0034
Updates
UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …;
UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …;
UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …;
Each update adds a new physical cell with its own write timestamp:
sensor_id ☞ value (t1) = 45.0034
sensor_id ☞ value (t13) = 47.4182
sensor_id ☞ value (t36) = 48.0300
Read
SELECT value FROM sensor_data WHERE sensor_id = …;
read N physical columns, only 1 useful …
sensor_id ☞ value (t1) = 45.0034
sensor_id ☞ value (t13) = 47.4182
sensor_id ☞ value (t36) = 48.0300
Solution 1: leveled compaction! (if your I/O can keep up)
Before compaction: sensor_id ☞ value (t1) = 45.0034, value (t13) = 47.4182, value (t36) = 48.0300
After compaction: sensor_id ☞ value (t36) = 48.0300
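The read cost and the effect of compaction can be illustrated with a toy model. It assumes each update has ended up in a different SSTable and that a read keeps the cell with the highest write timestamp; `read_latest` and `compact` are hypothetical names, and real compaction strategies are far more involved.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, float]      # (write timestamp, value)
SSTable = Dict[int, Cell]     # sensor_id -> cell


def read_latest(sstables: List[SSTable],
                sensor_id: int) -> Tuple[Optional[float], int]:
    """Return (latest value, number of physical cells examined)."""
    candidates = [table[sensor_id] for table in sstables if sensor_id in table]
    if not candidates:
        return None, 0
    # Last write wins: keep the cell with the highest timestamp.
    _, value = max(candidates, key=lambda cell: cell[0])
    return value, len(candidates)


def compact(sstables: List[SSTable]) -> List[SSTable]:
    """Merge all SSTables into one, keeping only the newest cell per key."""
    merged: SSTable = {}
    for table in sstables:
        for sensor_id, cell in table.items():
            if sensor_id not in merged or cell[0] > merged[sensor_id][0]:
                merged[sensor_id] = cell
    return [merged]


sstables = [{1: (1, 45.0034)}, {1: (13, 47.4182)}, {1: (36, 48.0300)}]
print(read_latest(sstables, 1))           # three cells examined
print(read_latest(compact(sstables), 1))  # one cell examined
```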
Solution 2: reversed timeseries & DateTiered compaction strategy
CREATE TABLE sensor_data (
    sensor_id bigint,
    date timestamp,
    sensor_value double,
    PRIMARY KEY((sensor_id), date))
WITH CLUSTERING ORDER BY (date DESC);
SELECT sensor_value FROM sensor_data WHERE sensor_id = … LIMIT 1;
sensor_id ☞ date3 (t3) = 48.0300 | date2 (t2) = 47.4182 | date1 (t1) = 45.0034 | …
Data cleaning by configuration (max_sstable_age_days)
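A minimal sketch of why the reversed layout makes the latest-value read cheap: rows of one partition are kept sorted newest first, so `LIMIT 1` only needs the head of the partition. `ReversedTimeseries` is a hypothetical in-memory stand-in for a single partition, not a driver or server API.

```python
import bisect
from typing import List, Tuple


class ReversedTimeseries:
    """One partition whose rows are kept in descending date order."""

    def __init__(self) -> None:
        # Stored as (-timestamp, value) so ascending sort means newest first.
        self._rows: List[Tuple[int, float]] = []

    def insert(self, timestamp: int, value: float) -> None:
        # Every measurement is a new row; nothing is overwritten.
        bisect.insort(self._rows, (-timestamp, value))

    def select(self, limit: int = 1) -> List[Tuple[int, float]]:
        # Equivalent in spirit to "... LIMIT n" on a DESC clustering order.
        return [(-neg_ts, value) for neg_ts, value in self._rows[:limit]]


partition = ReversedTimeseries()
partition.insert(1, 45.0034)
partition.insert(36, 48.0300)
partition.insert(13, 47.4182)
print(partition.select())
```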
Failure level! 
☠☠
Design around dynamic schema! 
Customer emergency call 
• 3 nodes cluster almost full 
• impossible to scale out 
• 4th node in JOINING state for 1 week 
• disk space is filling up, production at risk!
After investigation 
• 4th node in JOINING state because streaming is stalled 
• NPE in logs 
Cassandra source-code to the rescue
public class CompressedStreamReader extends StreamReader
{
    …
    @Override
    public SSTableWriter read(ReadableByteChannel channel) throws IOException
    {
        …
        Pair<String, String> kscf = Schema.instance.getCF(cfId);
        // NPE on the next line: kscf is null once the table has been dropped
        ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right);
The truth is
• the devs dynamically drop & recreate tables every day
• dynamic schema is at the core of their design
Example:
DROP TABLE catalog_127_20140613;
CREATE TABLE catalog_127_20140614( … );
Failure sequence
[Diagrams, 3 steps: (1) nodes n1 to n4 all hold table catalog_x_y; (2) table catalog_x_z is created on all four nodes; (3) catalog_x_y is gone from every node while it is still being looked up: "catalog_x_y ????"]
Consequences
• joining node always got stuck
• ☞ cannot extend cluster
• ☞ changing code takes time
• ☞ production in danger (no space left)
• ☞ sacrifice analytics data to survive
Nutshell 
• dynamic schema change as normal operations is not recommended 
• concurrent schema AND topology change is an anti-pattern
Failure level! 
☠☠☠☠
Q & A
Nice Examples! 
Rate limiting! 
Anti Fraud! 
Account Validation! 
Sensor Data Timeseries!
Rate limiting! 
Start-up company, reset password feature
1) /password/reset
2) SMS with token A0F83E63DB935465CE73DFE….
3) /password/new/<token>/<password>
[Diagram labels: Phone number, Random token]
Problem 1 
• account created with premium phone number 
• /password/reset x 100
« money, money, money, give money, in the richman’s world » $$$
Problem 2
• massive hack
• 10^6 /password/reset calls from few accounts
• SMS messages are cheap
• ☞ but not at the 10^6 per user per day scale
Solution 
• premium phone number ☞ Google libphonenumber 
• massive hack ☞ rate limiting with Cassandra
Cassandra Time To Live! 
Time to live 
• built-in feature 
• insert data with a TTL in sec 
• expires server-side automatically 
• ☞ use as sliding-window
Rate limiting in action!
Implementation
• threshold = max 3 reset password per sliding 24h
• when /password/reset called
• check threshold
  • reached ☞ error message/ignore
  • not reached ☞ log the attempt with TTL = 86400
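The steps above can be sketched in memory to show how TTL-expiring attempts behave as a sliding window. This is a simulation under stated assumptions, not the demo code: `SlidingWindowLimiter` and `allow` are hypothetical names, the clock is passed in explicitly, and expiry is computed client-side where Cassandra would expire rows server-side.

```python
from collections import defaultdict
from typing import Dict, List


class SlidingWindowLimiter:
    """Allow at most `threshold` attempts per login within `ttl_seconds`."""

    def __init__(self, threshold: int = 3, ttl_seconds: int = 86400) -> None:
        self.threshold = threshold
        self.ttl_seconds = ttl_seconds
        # login -> expiry times of logged attempts (stand-in for TTL rows)
        self._expiries: Dict[str, List[float]] = defaultdict(list)

    def allow(self, login: str, now: float) -> bool:
        # Attempts whose TTL has elapsed no longer count.
        live = [expiry for expiry in self._expiries[login] if expiry > now]
        if len(live) >= self.threshold:
            self._expiries[login] = live
            return False  # threshold reached: error message or ignore
        # Threshold not reached: log the attempt with its TTL.
        live.append(now + self.ttl_seconds)
        self._expiries[login] = live
        return True


limiter = SlidingWindowLimiter()
print([limiter.allow("jdoe", now) for now in (0, 10, 20, 30)])
```

Against a real cluster the check and the insert are two separate operations, so concurrent requests could slightly overshoot the threshold; for this use case that is usually acceptable.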
Rate limiting 
demo
Anti Fraud! 
Real story 
• many special offers available 
• 30 mins international calls (50 countries) 
• unlimited land-line calls to 5 countries 
• …
• each offer has a duration (week/month/year) 
• only one offer active at a time
Cassandra TTL
• check for existing offer before
SELECT count(*) FROM user_special_offer WHERE login = 'jdoe';
• then grant new offer
INSERT INTO user_special_offer(login, offer_code, …)
VALUES('jdoe', '30_mins_international', …)
USING TTL <offer_duration>;
Account Validation! 
Requirement 
• user creates new account 
• sends sms/email link with token to validate account 
• 10 days to validate
How to ?
• create account with 10 days TTL
INSERT INTO users(login, name, age)
VALUES('jdoe', 'John DOE', 33)
USING TTL 864000;
• create random token for validation with 10 days TTL
INSERT INTO account_validation(token, login, name, age)
VALUES('A0F83E63DB935465CE73DFE…', 'jdoe', 'John DOE', 33)
USING TTL 864000;
On token validation
• check token exists & retrieve user details
SELECT login, name, age FROM account_validation
WHERE token = 'A0F83E63DB935465CE73DFE…';
• re-insert durably user details without TTL
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
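The validation flow above can be walked through with a small in-memory model. `TtlTable`, `create_account` and `validate` are hypothetical names; the tables are plain dictionaries with a client-side expiry check standing in for Cassandra's server-side TTL, and the clock is passed explicitly.

```python
import secrets
from typing import Dict, Optional, Tuple

TEN_DAYS = 10 * 24 * 60 * 60  # 864000 seconds, as in USING TTL 864000


class TtlTable:
    """Key/value table whose rows may carry an expiry time."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[dict, Optional[float]]] = {}

    def insert(self, key: str, row: dict, now: float,
               ttl: Optional[int] = None) -> None:
        expires_at = None if ttl is None else now + ttl
        self._rows[key] = (dict(row), expires_at)

    def select(self, key: str, now: float) -> Optional[dict]:
        entry = self._rows.get(key)
        if entry is None:
            return None
        row, expires_at = entry
        if expires_at is not None and now >= expires_at:
            return None  # expired rows read as absent
        return row


def create_account(users: TtlTable, validations: TtlTable,
                   login: str, details: dict, now: float) -> str:
    token = secrets.token_hex(16)
    users.insert(login, details, now, ttl=TEN_DAYS)
    validations.insert(token, {"login": login, **details}, now, ttl=TEN_DAYS)
    return token


def validate(users: TtlTable, validations: TtlTable,
             token: str, now: float) -> bool:
    row = validations.select(token, now)
    if row is None:
        return False  # unknown or expired token
    details = {k: v for k, v in row.items() if k != "login"}
    users.insert(row["login"], details, now)  # no TTL: durable from now on
    return True
```

An account validated within the window is rewritten without a TTL and survives; one that is never validated simply disappears, with no cleanup job.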
Sensor Data Timeseries! 
Requirements
• lots of sensors (10^3 – 10^6)
• medium to high insertion rate (0.1 – 10/sec)
• keep good load balancing
• fast read & write
Bucketing! 
CREATE TABLE sensor_data (
    sensor_id text,
    date timestamp,
    raw_data blob,
    PRIMARY KEY(sensor_id, date));
sensor_id ☞ date1 = blob1 | date2 = blob2 | date3 = blob3 | date4 = blob4 | …
Problems:
• limit of 2×10^9 physical columns
• bad load balancing (1 sensor = 1 node)
• wide row spans over many files
Idea:
• composite partition key: sensor_id:date_bucket
• tunable date granularity: per hour/per day/per month …
CREATE TABLE sensor_data (
    sensor_id text,
    date_bucket int, // format YYYYMMdd (daily) or YYYYMMddHH (hourly)
    date timestamp,
    raw_data blob,
    PRIMARY KEY((sensor_id, date_bucket), date));
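A minimal sketch of how the bucket value could be derived from a measurement timestamp on the client side. `date_bucket` and the granularity names are illustrative assumptions; timestamps are taken as naive datetimes already in the time zone chosen for bucketing.

```python
from datetime import datetime

# Granularity -> strftime pattern for the integer bucket.
_FORMATS = {"month": "%Y%m", "day": "%Y%m%d", "hour": "%Y%m%d%H"}


def date_bucket(ts: datetime, granularity: str = "hour") -> int:
    """Return the integer bucket that a timestamp falls into."""
    return int(ts.strftime(_FORMATS[granularity]))


measured_at = datetime(2014, 9, 10, 14, 45)
print(date_bucket(measured_at))          # hourly bucket
print(date_bucket(measured_at, "day"))   # daily bucket
```

The granularity is a trade-off: finer buckets keep partitions small but mean more partitions to query for a given time range.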
Buckets
sensor_id:2014091014 ☞ date1 = blob1 | date2 = blob2 | date3 = blob3 | date4 = blob4 | …
sensor_id:2014091015 ☞ date11 = blob11 | date12 = blob12 | date13 = blob13 | date14 = blob14 | …
Advantage:
• distribute load: 1 bucket = 1 node
• limit partition width (max x columns per bucket)
But how can I select raw data between 14:45 and 15:10 ?
sensor_id:2014091014 ☞ 14:45 → ?
sensor_id:2014091015 ☞ 15:00 → 15:10
Solution
• use IN clause on partition key component
• with range condition on date column
☞ date column should be monotonic function (increasing/decreasing)
SELECT * FROM sensor_data WHERE sensor_id = xxx
AND date_bucket IN (2014091014, 2014091015)
AND date >= '2014-09-10 14:45:00.000'
AND date <= '2014-09-10 15:10:00.000';
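The list of buckets for the IN clause has to be computed by the client from the requested time range. The sketch below does this for hourly buckets; `hourly_buckets` is a hypothetical helper and assumes naive datetimes in the same time zone used when writing.

```python
from datetime import datetime, timedelta
from typing import List


def hourly_buckets(start: datetime, end: datetime) -> List[int]:
    """Return every YYYYMMddHH bucket touched by the range [start, end]."""
    if end < start:
        raise ValueError("end must not be before start")
    # Truncate to the beginning of the first hour, then step hour by hour.
    current = start.replace(minute=0, second=0, microsecond=0)
    buckets = []
    while current <= end:
        buckets.append(int(current.strftime("%Y%m%d%H")))
        current += timedelta(hours=1)
    return buckets


print(hourly_buckets(datetime(2014, 9, 10, 14, 45),
                     datetime(2014, 9, 10, 15, 10)))
```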
Bucketing Caveats!
IN clause for #partition is not a silver bullet !
• use sparingly
• keep cardinality low (≤ 5)
• prefer parallel async queries
• ease of query vs perf
[Diagrams: ring of nodes n1 to n8 holding partitions sensor_id:2014091014 and sensor_id:2014091015; first queried through a single coordinator node, then queried by an async client]
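The "parallel async queries" alternative can be sketched as one query per bucket, issued concurrently and merged on the client. The example uses a thread pool from the standard library; `fetch_bucket` is a hypothetical stand-in for a single-partition query, and a real driver's own asynchronous API would normally take its place.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def query_buckets_in_parallel(fetch_bucket: Callable[[int], List],
                              buckets: Iterable[int],
                              max_workers: int = 4) -> List:
    """Run one single-partition query per bucket and merge the results."""
    bucket_list = list(buckets)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the order of bucket_list.
        per_bucket = list(pool.map(fetch_bucket, bucket_list))
    return [row for rows in per_bucket for row in rows]


# Fake data standing in for two partitions of the same sensor.
fake_partitions = {
    2014091014: ["14:45 blob", "14:50 blob"],
    2014091015: ["15:05 blob"],
}
rows = query_buckets_in_parallel(
    lambda bucket: fake_partitions.get(bucket, []),
    [2014091014, 2014091015],
)
print(rows)
```

Each query then targets a single partition, at the price of more client code than a single IN query.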
Q & A
Cassandra developers!
Rule n°1
If you don’t know, ask for help
(me, Cassandra ML, PlanetCassandra, stackoverflow, …)
Rule n°2
Do not blind-guess troubleshooting alone in production
(ask for help, see rule n°1)
Rule n°3
Share with the community
(your best use-cases … and worst failures)
http://planetcassandra.org/
Thank You 
@doanduyhai 
duy_hai.doan@datastax.com

More Related Content

What's hot

IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...Marcin Bielak
 
Streaming SQL with Apache Calcite
Streaming SQL with Apache CalciteStreaming SQL with Apache Calcite
Streaming SQL with Apache CalciteJulian Hyde
 
Scaling for Performance
Scaling for PerformanceScaling for Performance
Scaling for PerformanceScyllaDB
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Markus Höfer
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsAlluxio, Inc.
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkFlink Forward
 
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...DataStax
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Eventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyEventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyScyllaDB
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustAltinity Ltd
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Flink Forward
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache icebergAlluxio, Inc.
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkTimo Walther
 
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019confluent
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxData
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...DataWorks Summit/Hadoop Summit
 
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandracodecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with CassandraDataStax Academy
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021StreamNative
 

What's hot (20)

IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
 
Streaming SQL with Apache Calcite
Streaming SQL with Apache CalciteStreaming SQL with Apache Calcite
Streaming SQL with Apache Calcite
 
Scaling for Performance
Scaling for PerformanceScaling for Performance
Scaling for Performance
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache Flink
 
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Eventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyEventually, Scylla Chooses Consistency
Eventually, Scylla Chooses Consistency
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache iceberg
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
 
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...
 
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandracodecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
 

Viewers also liked

strangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsstrangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsMatthew Dennis
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-PatternsMatthew Dennis
 
Cassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestCassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestDuyhai Doan
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentationDuyhai Doan
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
 
Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Ontico
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandraAxel Liljencrantz
 
Cassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesCassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesDuyhai Doan
 
Cassandra java libraries
Cassandra java librariesCassandra java libraries
Cassandra java librariesDuyhai Doan
 
Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Apekshit Sharma
 
Achilles presentation
Achilles presentationAchilles presentation
Achilles presentationDuyhai Doan
 
Cassandra Drivers and Tools
Cassandra Drivers and ToolsCassandra Drivers and Tools
Cassandra Drivers and ToolsDuyhai Doan
 
Cassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingCassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingMatthew Dennis
 
Effective cassandra development with achilles
Effective cassandra development with achillesEffective cassandra development with achilles
Effective cassandra development with achillesDuyhai Doan
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisDuyhai Doan
 
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...NoSQLmatters
 
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...DataStax
 
DZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarDZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarMatthew Dennis
 

Viewers also liked (20)

strangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsstrangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patterns
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-Patterns
 
Cassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestCassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapest
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentation
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 
Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"
 
Introduction to HBase
Introduction to HBaseIntroduction to HBase
Introduction to HBase
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
 
Cassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesCassandra rapid prototyping with achilles
Cassandra rapid prototyping with achilles
 
Cassandra java libraries
Cassandra java librariesCassandra java libraries
Cassandra java libraries
 
Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015
 
Achilles presentation
Achilles presentationAchilles presentation
Achilles presentation
 
Cassandra Drivers and Tools
Cassandra Drivers and ToolsCassandra Drivers and Tools
Cassandra Drivers and Tools
 
Cassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingCassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data Modeling
 
Effective cassandra development with achilles
Effective cassandra development with achillesEffective cassandra development with achilles
Effective cassandra development with achilles
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS Paris
 
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
 
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
 
DZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarDZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling Webinar
 
Apache Cassandra and Go
Apache Cassandra and GoApache Cassandra and Go
Apache Cassandra and Go
 

Similar to Cassandra nice use cases and worst anti patterns

Cassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaCassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaDuyhai Doan
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data modelDuyhai Doan
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016Duyhai Doan
 
Cassandra for the ops dos and donts
Cassandra for the ops   dos and dontsCassandra for the ops   dos and donts
Cassandra for the ops dos and dontsDuyhai Doan
 
KillrChat presentation
KillrChat presentationKillrChat presentation
KillrChat presentationDuyhai Doan
 
Cassandra introduction @ NantesJUG
Cassandra introduction @ NantesJUGCassandra introduction @ NantesJUG
Cassandra introduction @ NantesJUGDuyhai Doan
 
Cassandra introduction at FinishJUG
Cassandra introduction at FinishJUGCassandra introduction at FinishJUG
Cassandra introduction at FinishJUGDuyhai Doan
 
Cassandra drivers and libraries
Cassandra drivers and librariesCassandra drivers and libraries
Cassandra drivers and librariesDuyhai Doan
 
Cassandra introduction mars jug
Cassandra introduction mars jugCassandra introduction mars jug
Cassandra introduction mars jugDuyhai Doan
 
Cassandra data structures and algorithms
Cassandra data structures and algorithmsCassandra data structures and algorithms
Cassandra data structures and algorithmsDuyhai Doan
 
Libon cassandra summiteu2014
Libon cassandra summiteu2014Libon cassandra summiteu2014
Libon cassandra summiteu2014Duyhai Doan
 
Cassandra introduction @ ParisJUG
Cassandra introduction @ ParisJUGCassandra introduction @ ParisJUG
Cassandra introduction @ ParisJUGDuyhai Doan
 
KillrChat: Building Your First Application in Apache Cassandra (English)
KillrChat: Building Your First Application in Apache Cassandra (English)KillrChat: Building Your First Application in Apache Cassandra (English)
KillrChat: Building Your First Application in Apache Cassandra (English)DataStax Academy
 
KillrChat Data Modeling
KillrChat Data ModelingKillrChat Data Modeling
KillrChat Data ModelingDuyhai Doan
 
Understanding hd wallets design and implementation
Understanding hd wallets  design and implementationUnderstanding hd wallets  design and implementation
Understanding hd wallets design and implementationArcBlock
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016Duyhai Doan
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valleyPatrick McFadin
 
Cassandra Summit 2014: Real Data Models of Silicon Valley
Cassandra Summit 2014: Real Data Models of Silicon ValleyCassandra Summit 2014: Real Data Models of Silicon Valley
Cassandra Summit 2014: Real Data Models of Silicon ValleyDataStax Academy
 
Sasi, cassandra on full text search ride
Sasi, cassandra on full text search rideSasi, cassandra on full text search ride
Sasi, cassandra on full text search rideDuyhai Doan
 






Cassandra nice use cases and worst anti patterns

  • 1. Cassandra nice use-cases and worst anti-patterns DuyHai DOAN, Technical Advocate @doanduyhai
  • 2. Agenda! Anti-patterns • Queue-like designs • CQL null values • Intensive update on same column • Design around dynamic schema
  • 3. Agenda! Nice use-cases • Rate-limiting • Anti Fraud • Account validation • Sensor data timeseries
  • 4. Worst anti-patterns! Queue-like designs! CQL null! Intensive update on same column! Design around dynamic schema!
  • 5. Failure level! ☠ ☠☠ ☠☠☠ ☠☠☠☠
  • 6. Queue-like designs! Adding new message ☞ 1 physical insert
  • 7. Queue-like designs! Adding new message ☞ 1 physical insert Consuming message = deleting it ☞ 1 physical insert (tombstone)
  • 8. Queue-like designs! Adding new message ☞ 1 physical insert Consuming message = deleting it ☞ 1 physical insert (tombstone) Transactional queue = re-inserting messages ☞ physical insert * <many>
  • 9. Queue-like designs! FIFO queue Physical inserts: A ☞ logical queue: { A }
  • 10. Queue-like designs! FIFO queue Physical inserts: A B ☞ logical queue: { A, B }
  • 11. Queue-like designs! FIFO queue Physical inserts: A B C ☞ logical queue: { A, B, C }
  • 12. Queue-like designs! FIFO queue Physical inserts: A B C A(tombstone) ☞ logical queue: { B, C }
  • 13. Queue-like designs! FIFO queue Physical inserts: A B C A(tombstone) D ☞ logical queue: { B, C, D }
  • 14. Queue-like designs! FIFO queue Physical inserts: A B C A(tombstone) D B(tombstone) ☞ logical queue: { C, D }
  • 15. Queue-like designs! FIFO queue Physical inserts: A B C A(tombstone) D B(tombstone) C(tombstone) ☞ logical queue: { D }
  • 16. Queue-like designs! FIFO queue, worst case Physical inserts: A A A A A A A A A A ☞ logical queue: { }
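The cost of the queue pattern can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions: the partition is a plain Python list, `read_head` and `simulate` are hypothetical names, and nothing here reflects Cassandra's actual storage engine. It shows that reading the head of the queue has to skip every tombstone left by earlier consumers.

```python
def read_head(partition):
    """Return (first live message, number of tombstones skipped)."""
    skipped = 0
    for message, deleted in partition:
        if not deleted:
            return message, skipped
        skipped += 1
    return None, skipped


def simulate(n_messages, n_consumed):
    """Insert n_messages, consume the first n_consumed, then read the head."""
    # Each entry is [message, deleted]; consuming a message only marks it deleted.
    partition = [[f"msg-{i}", False] for i in range(n_messages)]
    for i in range(n_consumed):
        partition[i][1] = True  # consuming = writing a tombstone
    return read_head(partition)


head, tombstones_skipped = simulate(10, 7)
print(head, tombstones_skipped)
```

With every message consumed, `simulate(10, 10)` returns no message at all yet still walks over ten tombstones, which is the worst case shown on slide 16.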
  • 18. CQL null semantics! Reading null value means • value does not exist (has never been created) • value deleted (tombstone) SELECT age FROM users WHERE login = 'ddoan'; → NULL
  • 19. CQL null semantics! Writing null means • delete value (creating tombstone) • even though it does not exist UPDATE users SET age = NULL WHERE login = 'ddoan';
  • 20. CQL null semantics! Seen in production: prepared statement UPDATE users SET age = ?, … geo_location = ?, mood = ?, … WHERE login = ?;
  • 21. CQL null semantics! Seen in production: bound statement preparedStatement.bind(33, …, null, null, null, …); null ☞ tombstone creation on each update. Row jdoe: age = 33, name = John DOE, geo_loc = ✗, mood = ✗, status = ✗ (✗ = tombstone) …
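One way to avoid the tombstones described above is to leave null columns out of the statement altogether. The sketch below only builds the query string and parameter tuple; `build_update` is a hypothetical helper, not part of any Cassandra driver, and it assumes table and column names come from trusted code rather than user input.

```python
def build_update(table, key_column, key_value, columns):
    """Build a CQL UPDATE that only sets columns whose value is not None.

    Returns (query, params), or (None, ()) when there is nothing to update.
    """
    assignments = {name: value for name, value in columns.items() if value is not None}
    if not assignments:
        return None, ()
    set_clause = ", ".join(f"{name} = ?" for name in assignments)
    query = f"UPDATE {table} SET {set_clause} WHERE {key_column} = ?"
    params = tuple(assignments.values()) + (key_value,)
    return query, params


query, params = build_update(
    "users", "login", "jdoe",
    {"age": 33, "geo_location": None, "mood": None},
)
print(query)
print(params)
```

The trade-off is one prepared statement per combination of columns. Cassandra 2.2 and later (native protocol v4) can also leave a bound variable unset, which avoids the tombstone without rewriting the query; check your driver's documentation for how it exposes this.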
  • 23. Intensive update! Context • small start-up • cloud-based video recording & alarm • internet of things (sensor) • 10 updates/sec for some sensors
  • 24. Intensive update on same column! Data model: sensor_id ☞ value (45.0034) CREATE TABLE sensor_data ( sensor_id bigint, value double, PRIMARY KEY(sensor_id));
  • 25. Intensive update on same column! Updates UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …; UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …; UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …; Physical columns: value (t1) 45.0034 | value (t13) 47.4182 | value (t36) 48.0300
  • 26. Intensive update on same column! Read SELECT value FROM sensor_data WHERE sensor_id = …; ☞ read N physical columns, only 1 useful … Physical columns: value (t1) 45.0034 | value (t13) 47.4182 | value (t36) 48.0300
  • 27. Intensive update on same column! Solution 1: leveled compaction! (if your I/O can keep up) Physical columns: value (t1) 45.0034 | value (t13) 47.4182 | value (t36) 48.0300 ☞ after compaction: value (t36) 48.0300
  • 28. Intensive update on same column! Solution 2: reversed timeseries & DateTiered compaction strategy CREATE TABLE sensor_data ( sensor_id bigint, date timestamp, sensor_value double, PRIMARY KEY((sensor_id), date)) WITH CLUSTERING ORDER BY (date DESC);
  • 29. Intensive update on same column! SELECT sensor_value FROM sensor_data WHERE sensor_id = … LIMIT 1; Partition sensor_id: date3(t3) = 48.0300, date2(t2) = 47.4182, date1(t1) = 45.0034, … Data cleaning by configuration (max_sstable_age_days)
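The read amplification behind this anti-pattern can be sketched with a toy model, assuming each SSTable is a dict from sensor id to a `(write time, value)` pair. `read_value` and `compact` are hypothetical names for illustration only and do not mirror Cassandra's real read path or compaction code.

```python
def read_value(sstables, sensor_id):
    """Merge every on-disk version of one sensor's value; last write wins.

    Returns (latest value, number of physical versions read).
    """
    versions = [sstable[sensor_id] for sstable in sstables if sensor_id in sstable]
    if not versions:
        return None, 0
    _, value = max(versions, key=lambda version: version[0])
    return value, len(versions)


def compact(sstables):
    """Merge all SSTables into one, keeping only the newest version per key."""
    merged = {}
    for sstable in sstables:
        for key, version in sstable.items():
            if key not in merged or version[0] > merged[key][0]:
                merged[key] = version
    return [merged]


# Three flushes, each holding one update of the same column: (write time, value).
sstables = [{42: (1, 45.0034)}, {42: (13, 47.4182)}, {42: (36, 48.0300)}]
print(read_value(sstables, 42))           # three versions read, one useful
print(read_value(compact(sstables), 42))  # one version left after compaction
```

In this model the read touches three versions before compaction and one after, which is the effect Solution 1 relies on when I/O can keep up.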
  • 31. Design around dynamic schema! Customer emergency call • 3 nodes cluster almost full • impossible to scale out • 4th node in JOINING state for 1 week • disk space is filling up, production at risk!
  • 32. Design around dynamic schema! After investigation • 4th node in JOINING state because streaming is stalled • NPE in logs
  • 33. Design around dynamic schema! After investigation • 4th node in JOINING state because streaming is stalled • NPE in logs Cassandra source-code to the rescue
  • 34. Design around dynamic schema! public class CompressedStreamReader extends StreamReader { … @Override public SSTableWriter read(ReadableByteChannel channel) throws IOException { … Pair<String, String> kscf = Schema.instance.getCF(cfId); ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right); // NPE here
  • 35. Design around dynamic schema! The truth is • the devs dynamically drop & recreate table every day • dynamic schema is at the core of their design Example: DROP TABLE catalog_127_20140613; CREATE TABLE catalog_127_20140614( … );
  • 36. Design around dynamic schema! Failure sequence (ring diagram): nodes n1, n2, n3 and n4 each hold table catalog_x_y
  • 37. Design around dynamic schema! Failure sequence (ring diagram): nodes n1, n2, n3 and n4 each hold table catalog_x_y and the new table catalog_x_z
  • 38. Design around dynamic schema! Failure sequence (ring diagram): catalog_x_y ???? Nodes n1, n2, n3 and n4 each hold only table catalog_x_z
  • 39. Design around dynamic schema! Consequences • joining node always got stuck • → cannot extend cluster • → changing code takes time • → production in danger (no space left) • → sacrifice analytics data to survive
  • 40. Design around dynamic schema! Nutshell • dynamic schema change as a normal operation is not recommended • concurrent schema AND topology change is an anti-pattern
  • 41. Failure level! ☠☠☠☠
  • 42. Q & A
  • 43. Nice Examples! Rate limiting! Anti Fraud! Account Validation! Sensor Data Timeseries!
  • 44. Rate limiting! Start-up company, reset password feature 1) /password/reset (phone number) 2) SMS with token A0F83E63DB935465CE73DFE…. (random token) 3) /password/new/<token>/<password>
  • 45. Rate limiting! Problem 1 • account created with premium phone number
  • 46. Rate limiting! Problem 1 • account created with premium phone number • /password/reset x 100
  • 47. Rate limiting! « money, money, money, give money, in the rich man’s world » $$$
  • 48. Rate limiting! Problem 2 • massive hack
  • 49. Rate limiting! Problem 2 • massive hack • 10^6 /password/reset calls from a few accounts
  • 50. Rate limiting! Problem 2 • massive hack • 10^6 /password/reset calls from a few accounts • SMS messages are cheap
  • 51. Rate limiting! Problem 2 • ☞ but not at the scale of 10^6 per user per day
  • 52. Rate limiting! Solution • premium phone number ☞ Google libphonenumber
  • 53. Rate limiting! Solution • premium phone number ☞ Google libphonenumber • massive hack ☞ rate limiting with Cassandra
  • 54. Cassandra Time To Live! Time to live • built-in feature • insert data with a TTL in sec • expires server-side automatically • ☞ use as sliding-window
  • 55. Rate limiting in action! Implementation • threshold = max 3 password resets per sliding 24h
  • 56. Rate limiting in action! Implementation • when /password/reset called • check threshold • reached ☞ error message/ignore • not reached ☞ log the attempt with TTL = 86400
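The sliding-window logic of slides 55 and 56 can be sketched in memory. This is an illustration under stated assumptions, not the implementation used in production: `ResetPasswordLimiter` is a hypothetical class, a dict of expiry times stands in for rows written with `USING TTL 86400`, and the caller passes the current time in seconds so the behaviour is easy to test.

```python
class ResetPasswordLimiter:
    """Allow at most `threshold` attempts per login in any sliding TTL window."""

    def __init__(self, threshold=3, ttl_seconds=86400):
        self.threshold = threshold
        self.ttl_seconds = ttl_seconds
        self._attempts = {}  # login -> expiry times of logged attempts

    def allow(self, login, now):
        # Expired attempts vanish on their own, like rows written with a TTL.
        live = [expiry for expiry in self._attempts.get(login, []) if expiry > now]
        allowed = len(live) < self.threshold
        if allowed:
            # Threshold not reached: log the attempt with TTL = ttl_seconds.
            live.append(now + self.ttl_seconds)
        self._attempts[login] = live
        return allowed


limiter = ResetPasswordLimiter()
for second in range(5):
    print(second, limiter.allow("jdoe", now=second))
```

With Cassandra the list scan becomes a count of the rows still alive for that login, and no cleanup job is needed because expiry happens server-side.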
  • 58. Anti Fraud! Real story • many special offers available • 30 mins international calls (50 countries) • unlimited land-line calls to 5 countries • …
  • 59. Anti Fraud! Real story • each offer has a duration (week/month/year) • only one offer active at a time
  • 60. Anti Fraud! Cassandra TTL • check for existing offer before SELECT count(*) FROM user_special_offer WHERE login = 'jdoe';
  • 61. Anti Fraud! Cassandra TTL • then grant new offer INSERT INTO user_special_offer(login, offer_code, …) VALUES('jdoe', '30_mins_international', …) USING TTL <offer_duration>;
  • 62. Account Validation! Requirement • user creates new account • sends sms/email link with token to validate account • 10 days to validate
  • 63. Account Validation! How to? • create account with 10 days TTL INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33) USING TTL 864000;
  • 64. Account Validation! How to? • create random token for validation with 10 days TTL INSERT INTO account_validation(token, login, name, age) VALUES('A0F83E63DB935465CE73DFE…', 'jdoe', 'John DOE', 33) USING TTL 864000;
  • 65. Account Validation! On token validation • check token exists & retrieve user details SELECT login, name, age FROM account_validation WHERE token = 'A0F83E63DB935465CE73DFE…'; • re-insert user details durably, without TTL INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
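The validation flow of slides 63 to 65 can be walked through with an in-memory stand-in for the two tables. This is a sketch only: `AccountStore` and its methods are hypothetical names, expiry times replace Cassandra's TTL, and the caller supplies the current time in seconds.

```python
import secrets

TEN_DAYS = 864000  # seconds, same value as the TTL on the slides


class AccountStore:
    def __init__(self):
        self.users = {}        # login -> (details, expiry or None)
        self.validations = {}  # token -> (login, details, expiry)

    def create_account(self, login, details, now):
        """Insert the account and its validation token, both with a 10-day TTL."""
        token = secrets.token_hex(16)
        expiry = now + TEN_DAYS
        self.users[login] = (details, expiry)
        self.validations[token] = (login, details, expiry)
        return token

    def validate(self, token, now):
        """Check the token is still alive, then re-insert the user without TTL."""
        entry = self.validations.get(token)
        if entry is None or entry[2] <= now:
            return False
        login, details, _ = entry
        self.users[login] = (details, None)  # no expiry: the account is durable
        return True

    def is_active(self, login, now):
        entry = self.users.get(login)
        return entry is not None and (entry[1] is None or entry[1] > now)


store = AccountStore()
token = store.create_account("jdoe", {"name": "John DOE", "age": 33}, now=0)
print(store.validate(token, now=3600), store.is_active("jdoe", now=10**9))
```

An account that is never validated simply stops being visible after ten days, so no batch job has to purge abandoned sign-ups.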
  • 66. Sensor Data Timeseries! Requirements • lots of sensors (10^3 – 10^6) • medium to high insertion rate (0.1 – 10/sec) • keep good load balancing • fast read & write
  • 67. Bucketing! CREATE TABLE sensor_data ( sensor_id text, date timestamp, raw_data blob, PRIMARY KEY(sensor_id, date)); Partition sensor_id: date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, …
  • 68. Bucketing! Problems: • limit of 2 × 10^9 physical columns • bad load balancing (1 sensor = 1 node) • wide row spans over many files Partition sensor_id: date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, …
  • 69. Bucketing! Idea: • composite partition key: sensor_id:date_bucket • tunable date granularity: per hour/per day/per month … CREATE TABLE sensor_data ( sensor_id text, date_bucket int, // e.g. format YYYYMMddHH for hourly buckets date timestamp, raw_data blob, PRIMARY KEY((sensor_id, date_bucket), date));
  • 70. Bucketing! Idea: • composite partition key: sensor_id:date_bucket • tunable date granularity: per hour/per day/per month … Buckets: sensor_id:2014091014 ☞ date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, … | sensor_id:2014091015 ☞ date11 = blob11, date12 = blob12, date13 = blob13, date14 = blob14, …
  • 71. Bucketing! Advantage: • distribute load: 1 bucket = 1 node • limit partition width (max x columns per bucket) Buckets: sensor_id:2014091014 ☞ date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, … | sensor_id:2014091015 ☞ date11 = blob11, date12 = blob12, date13 = blob13, date14 = blob14, …
  • 72. Bucketing! But how can I select raw data between 14:45 and 15:10? 14:45 → ? | 15:00 → 15:10 Buckets: sensor_id:2014091014 | sensor_id:2014091015
  • 73. Bucketing! Solution • use IN clause on partition key component • with range condition on date column ☞ date column should be a monotonic function (increasing/decreasing) SELECT * FROM sensor_data WHERE sensor_id = xxx AND date_bucket IN (2014091014, 2014091015) AND date >= '2014-09-10 14:45:00.000' AND date <= '2014-09-10 15:10:00.000';
  • 74. Bucketing Caveats! IN clause on the partition key is not a silver bullet! • use sparingly • keep cardinality low (≤ 5) (diagram: ring n1 … n8, one coordinator node fetching partitions sensor_id:2014091014 and sensor_id:2014091015)
  • 75. Bucketing Caveats! IN clause on the partition key is not a silver bullet! • use sparingly • keep cardinality low (≤ 5) • prefer parallel async queries • ease of query vs perf (diagram: ring n1 … n8, async client querying partitions sensor_id:2014091014 and sensor_id:2014091015 directly)
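Whether the buckets go into one IN clause or into one async query each, the client first has to work out which buckets a time range touches. A minimal sketch, assuming hourly buckets encoded as YYYYMMddHH integers like the examples on the slides; `hourly_buckets` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def hourly_buckets(start, end):
    """Return the YYYYMMddHH buckets (as ints) that cover [start, end]."""
    if end < start:
        raise ValueError("end must not be before start")
    current = start.replace(minute=0, second=0, microsecond=0)
    buckets = []
    while current <= end:
        buckets.append(int(current.strftime("%Y%m%d%H")))
        current += timedelta(hours=1)
    return buckets


buckets = hourly_buckets(datetime(2014, 9, 10, 14, 45), datetime(2014, 9, 10, 15, 10))
print(buckets)
```

If the list grows beyond a handful of buckets, that is the signal from slide 75 to issue one query per bucket in parallel instead of a single wide IN clause.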
  • 76. Q & A
  • 77. Cassandra developers! Rule n°1 If you don’t know, ask for help (me, Cassandra ML, PlanetCassandra, stackoverflow, …)!
  • 78. Cassandra developers! Rule n°2 Do not blind-guess when troubleshooting alone in production (ask for help, see rule n°1)!
  • 79. Cassandra developers! Rule n°3 Share with the community (your best use-cases … and worst failures)! http://planetcassandra.org/
  • 80. Thank You @doanduyhai duy_hai.doan@datastax.com