@PatrickMcFadin
Patrick McFadin

Chief Evangelist, DataStax
Cassandra 3.0 Data Modeling
A brief history of CQL
You
CQL 3.0 - Cassandra 1.2
• Goodbye CQL 2.0!
• Custom secondary indexes
• Empty IN
CQL 3.1 - Cassandra 2.0
• Aliases
• CREATE <table> IF NOT EXISTS
• INSERT IF NOT EXISTS
• UPDATE IF
• DELETE IF EXISTS
• IN supports clustering columns
LWT
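As a rough sketch of how the conditional (lightweight transaction) statements above read in practice, the following uses a hypothetical user_credentials table. The table, its columns, and the values are assumptions for illustration and are not part of this deck's schema.

```cql
-- Hypothetical table, created only if it does not already exist.
CREATE TABLE IF NOT EXISTS user_credentials (
    email text,
    password text,
    userid uuid,
    PRIMARY KEY (email)
);

-- Insert only when no row with this key exists yet.
INSERT INTO user_credentials (email, password, userid)
VALUES ('patrick@datastax.com', 'old-hash', 9761d3d7-7fbd-4269-9988-6cfd4e188678)
IF NOT EXISTS;

-- Update only when the current value matches the expected one.
UPDATE user_credentials
SET password = 'new-hash'
WHERE email = 'patrick@datastax.com'
IF password = 'old-hash';

-- Delete only when the row is present.
DELETE FROM user_credentials
WHERE email = 'patrick@datastax.com'
IF EXISTS;
```

Each conditional statement returns an [applied] column that tells the client whether the condition held.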
CQL 3.2 - Cassandra 2.1
• User Defined Types
• Collection Indexing
• Indexes can use contains
• Tuples?
User Defined Types
CREATE TYPE video_metadata (
height int,
width int,
video_bit_rate set<text>,
encoding text
);
User Defined Types
CREATE TABLE videos (
videoid uuid,
userid uuid,
name varchar,
description varchar,
location text,
location_type int,
preview_thumbnails map<text,text>,
tags set<varchar>,
metadata set <frozen<video_metadata>>,
added_date timestamp,
PRIMARY KEY (videoid)
);
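A minimal sketch of writing and reading the UDT column, assuming the video_metadata type and videos table above. The name and metadata values are made-up sample data.

```cql
-- The metadata column is a set, so the UDT literal sits inside set braces.
INSERT INTO videos (videoid, name, metadata)
VALUES (
    99051fe9-6a9c-46c2-b949-38ef78858dd0,
    'Sample video',
    { { height: 480, width: 640, video_bit_rate: {'1000kbps'}, encoding: 'MP4' } }
);

SELECT name, metadata
FROM videos
WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0;
```

Because the type is frozen inside the set, each element is written and replaced as a whole rather than field by field.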
CQL 3.3 - Cassandra 2.2
• Date and Time are now types
• TinyInt and SmallInt
• User Defined Functions
• Aggregates
• User Defined Aggregates
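A short sketch of the new date, time, tinyint, and smallint types, using a hypothetical video_playback_events table. The table and all of its columns are assumptions for illustration only.

```cql
CREATE TABLE IF NOT EXISTS video_playback_events (
    videoid uuid,
    event_date date,          -- calendar day, no time of day
    event_time time,          -- time of day, no date
    quality_level tinyint,    -- 8-bit signed integer
    buffer_seconds smallint,  -- 16-bit signed integer
    PRIMARY KEY ((videoid, event_date), event_time)
);

INSERT INTO video_playback_events
    (videoid, event_date, event_time, quality_level, buffer_seconds)
VALUES
    (99051fe9-6a9c-46c2-b949-38ef78858dd0, '2015-06-01', '20:50:00', 3, 120);

SELECT event_time, quality_level, buffer_seconds
FROM video_playback_events
WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0
  AND event_date = '2015-06-01';
```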
User Defined Functions
CREATE TABLE video_rating (
videoid uuid,
rating_counter counter,
rating_total counter,
PRIMARY KEY (videoid)
);
CREATE OR REPLACE FUNCTION avg_rating (rating_counter counter, rating_total counter)
CALLED ON NULL INPUT
RETURNS double
LANGUAGE java AS
'return Double.valueOf(rating_total.doubleValue() / rating_counter.doubleValue());';
User Defined Functions
SELECT avg_rating(rating_counter, rating_total) AS avg_rating
FROM video_rating
WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0;
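For context, a sketch of how the two counters that avg_rating reads might be maintained. The rating value of 4 is sample data.

```cql
-- Record one new rating of 4 for the video:
-- bump the number of ratings by 1 and the running total by the rating.
UPDATE video_rating
SET rating_counter = rating_counter + 1,
    rating_total = rating_total + 4
WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0;
```

On the 2.2 and 3.x releases, user defined functions also have to be switched on in cassandra.yaml (enable_user_defined_functions: true) before CREATE FUNCTION is accepted.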
Aggregates
CREATE TABLE video_ratings_by_user (
videoid uuid,
userid uuid,
rating int,
PRIMARY KEY (videoid, userid)
);
SELECT count(userid)
FROM video_ratings_by_user
WHERE videoid = 49f64d40-7d89-4890-b910-dbf923563a33;
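The deck lists User Defined Aggregates without an example. One possible sketch, assuming the video_ratings_by_user table above, is shown here. The names avg_state, avg_final, and average_rating are hypothetical, and the pattern follows the usual state-function plus final-function shape rather than anything taken from the slides.

```cql
-- State function: accumulates (count, sum) in a tuple.
CREATE OR REPLACE FUNCTION avg_state (state tuple<int, bigint>, val int)
CALLED ON NULL INPUT
RETURNS tuple<int, bigint>
LANGUAGE java AS
'if (val != null) {
   state.setInt(0, state.getInt(0) + 1);
   state.setLong(1, state.getLong(1) + val.intValue());
 }
 return state;';

-- Final function: turns (count, sum) into an average.
CREATE OR REPLACE FUNCTION avg_final (state tuple<int, bigint>)
CALLED ON NULL INPUT
RETURNS double
LANGUAGE java AS
'if (state.getInt(0) == 0) return null;
 double r = state.getLong(1);
 r /= state.getInt(0);
 return Double.valueOf(r);';

-- The aggregate wires the two functions together, starting from (0, 0).
CREATE OR REPLACE AGGREGATE average_rating (int)
SFUNC avg_state
STYPE tuple<int, bigint>
FINALFUNC avg_final
INITCOND (0, 0);

SELECT average_rating(rating)
FROM video_ratings_by_user
WHERE videoid = 49f64d40-7d89-4890-b910-dbf923563a33;
```

Restricting the query to a single partition, as here, keeps the aggregate from scanning the whole table.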
CQL 3.4 - Cassandra 3.x
• CAST operator
• Per Partition Limit
• Materialized Views
• SASI
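A brief sketch of the first two items, assuming the video_ratings_by_user table from the Aggregates slide. The column alias rating_as_double is made up for illustration.

```cql
-- Per partition limit: return at most 2 ratings for each videoid partition.
SELECT videoid, userid, rating
FROM video_ratings_by_user
PER PARTITION LIMIT 2;

-- CAST: convert the int rating to a double in the result set.
SELECT userid, CAST(rating AS double) AS rating_as_double
FROM video_ratings_by_user
WHERE videoid = 49f64d40-7d89-4890-b910-dbf923563a33;
```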
Materialized View
CREATE TABLE videos (
videoid uuid,
userid uuid,
name varchar,
description varchar,
location text,
location_type int,
preview_thumbnails map<text,text>,
tags set<varchar>,
metadata set <frozen<video_metadata>>,
added_date timestamp,
PRIMARY KEY (videoid)
);
Lookup by this?
Materialized View
CREATE TABLE videos_by_location (
videoid uuid,
userid uuid,
location text,
added_date timestamp,
PRIMARY KEY (location, videoid)
);
Roll your own
Materialized View
CREATE MATERIALIZED VIEW videos_by_location
AS SELECT userid, added_date, videoid, location
FROM videos
WHERE videoId IS NOT NULL AND location IS NOT NULL
PRIMARY KEY(location, videoid);
Cassandra rolls for you
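A sketch of how the view might be used, assuming the videos table and videos_by_location view above. The userid, name, and added_date values are sample data.

```cql
-- Writes go to the base table only; Cassandra maintains the view.
INSERT INTO videos (videoid, userid, name, location, added_date)
VALUES (
    06049cbb-dfed-421f-b889-5f649a0de1ed,
    9761d3d7-7fbd-4269-9988-6cfd4e188678,
    'Sample video',
    'http://www.youtube.com/watch?v=px6U2n74q3g',
    '2015-06-01'
);

-- Reads by location go to the view, using its partition key.
SELECT videoid, userid, added_date
FROM videos_by_location
WHERE location = 'http://www.youtube.com/watch?v=px6U2n74q3g';
```

The view is updated asynchronously, so a read issued immediately after the write may not see the new row yet.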
Materialized View Perf
5 Materialized Views vs 5 tables writes async
Materialized View
SELECT location, videoid
FROM videos_by_location ;
location | videoid
-------------------------------------------------+--------------------------------------
http://www.youtube.com/watch?v=px6U2n74q3g | 06049cbb-dfed-421f-b889-5f649a0de1ed
http://www.youtube.com/watch?v=qphhxujn5Es | 873ff430-9c23-4e60-be5f-278ea2bb21bd
/us/vid/0c/0c3f7e87-f6b6-41d2-9668-2b64d117102c | 0c3f7e87-f6b6-41d2-9668-2b64d117102c
/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401 | 99051fe9-6a9c-46c2-b949-38ef78858dd0
/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401 | b3a76c6b-7c7f-4af6-964f-803a9283c401
http://www.youtube.com/watch?v=HdJlsOZVGwM | 49f64d40-7d89-4890-b910-dbf923563a33
/us/vid/41/416a5ddc-00a5-49ed-adde-d99da9a27c0c | 416a5ddc-00a5-49ed-adde-d99da9a27c0c
SASI
CREATE TABLE users (
userid uuid,
firstname varchar,
lastname varchar,
email text,
created_date timestamp,
PRIMARY KEY (userid)
);
Lookup by this?
Storage Attached Secondary Index
SASI
SASI
CREATE CUSTOM INDEX ON users (firstname)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
'analyzer_class':
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
'case_sensitive': 'false'
};
SASI
CREATE CUSTOM INDEX ON users (lastname)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode': 'CONTAINS'};
SASI
CREATE CUSTOM INDEX ON users (created_date)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode': 'SPARSE'};
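The SASI Queries slide also matches on the email column, for which no index definition is shown. A minimal sketch, assuming email is indexed in CONTAINS mode in the same way as lastname:

```cql
-- Assumption: email uses the same CONTAINS mode as the lastname index,
-- so that LIKE '%...%' patterns can be served by the index.
CREATE CUSTOM INDEX ON users (email)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode': 'CONTAINS'};
```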
SASI Indexes
Client
INSERT INTO users(userid,firstname,lastname,email,created_date)
VALUES (9761d3d7-7fbd-4269-9988-6cfd4e188678,'Patrick','McFadin',
'patrick@datastax.com','2015-06-01');
[Diagram: Node with Memtable, three SSTables, an Indexer, and SASI Index components; Data rows userid 1 and userid 2 with indexed columns lastname, firstname, email, created_date]
SASI Queries
SELECT * FROM users WHERE firstname LIKE 'pat%';
SELECT * FROM users WHERE lastname LIKE '%Fad%';
SELECT * FROM users WHERE email LIKE '%data%';
SELECT * FROM users
WHERE created_date > '2011-06-15'
AND created_date < '2011-06-30';
userid | created_date | email | firstname | lastname
--------------------------------------+---------------------------------+----------------------+-----------+----------
9761d3d7-7fbd-4269-9988-6cfd4e188678 | 2011-06-20 20:50:00.000000+0000 | patrick@datastax.com | Patrick | McFadin
SASI Guidelines
Use SASI when…
• Multiple fields to search
• No more than 1000 rows returned
• You know the partition key
• Indexing static columns
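A sketch of a search across multiple indexed fields, assuming the users table and the firstname and created_date SASI indexes defined earlier. The LIMIT simply mirrors the 1000-row guideline above.

```cql
-- Combine two SASI-indexed columns in one query.
-- Multi-column SASI predicates need ALLOW FILTERING.
SELECT userid, firstname, lastname, created_date
FROM users
WHERE firstname LIKE 'pat%'
  AND created_date > '2011-06-15'
  AND created_date < '2011-06-30'
LIMIT 1000
ALLOW FILTERING;
```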
SASI Guidelines
Don’t Use SASI when…
• Searching large partitions
• Tight SLA on reads
• Search for analytics
• Ordering search is important