NoSQL Data Modeling 101
Tzach Livyatan
Content
■ Basic Data Modeling
● CQL
● Partition Key
● Clustering Key
■ Materialized Views
NoSQL Vs. Relational
■ Relational: Data ➔ Model (Schema) ➔ Application
■ NoSQL: Application ➔ Model (Schema) ➔ Data
Data Modeling Terminology
➔ Cluster
◆ Keyspace
● Table
● Partition
● Row
○ Column - name / value pair
What is CQL
■ Cassandra Query Language
■ Similar to SQL (Structured Query Language)
■ Data Definition (DDL)
● CREATE / DROP / ALTER Keyspace
● CREATE / DROP / ALTER Table
■ Data Manipulation (DML)
● SELECT
● INSERT
● UPDATE
● DELETE
● BATCH
Keyspace
A top-level object that controls replication per data center (DC).
Contains tables, indexes, materialized views, and user-defined types.
CREATE KEYSPACE Excalibur
WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 1, 'DC2' : 3}
AND durable_writes = true;
Keyspace Example
CREATE KEYSPACE mykeyspace
WITH replication = {'class': 'NetworkTopologyStrategy', 'AWS_US_EAST_1' : 3}
AND durable_writes = true;
USE mykeyspace;
Common Data Types
■ ASCII
■ BIGINT
■ BLOB
■ BOOLEAN
■ COUNTER
■ DATE
■ DECIMAL
■ DOUBLE
■ DURATION
■ FLOAT
■ INET
■ INT
■ SMALLINT
■ TEXT
■ TIME
■ TIMESTAMP
■ TIMEUUID
■ TINYINT
■ UUID
■ VARCHAR
■ VARINT
* https://docs.scylladb.com/getting-started/types/
Collections
■ Sets
■ Lists
■ Maps
■ UDT
Partition Key
Key / Value Example
SELECT pet_chip_id,owner,pet_name FROM pet_owner;
pet_chip_id                           owner              pet_name
a2a60505-3e17-4ad4-8e1a-f11139caa1cc  642adfee-6ad9-...  Buddy
80d39c78-9dc0-11eb-a8b3-0242ac130003  642adfee-6ad9-...  Rocky
92cf4f94-9dc0-11eb-a8b3-0242ac130003  b4a63c18-9dc0-...  Cat
...                                   ...                ...
Key / Value Example
CREATE TABLE IF NOT EXISTS pet_owner (
pet_chip_id uuid,
owner uuid,
pet_name text,
PRIMARY KEY (pet_chip_id)
);
Partition Key: pet_chip_id
pet_chip_id                           owner              pet_name
a2a60505-3e17-4ad4-8e1a-f11139caa1cc  642adfee-6ad9-...  Buddy
80d39c78-9dc0-11eb-a8b3-0242ac130003  642adfee-6ad9-...  Rocky
92cf4f94-9dc0-11eb-a8b3-0242ac130003  b4a63c18-9dc0-...  Cat
...                                   ...                ...
INSERT INTO pet_owner(pet_chip_id,owner,pet_name) VALUES (a2a60505-3e17-4ad4-8e1a-f11139caa1cc, 642adfee-6ad9-4ca5-aa32-a72e506b8ad8, 'Buddy');
INSERT INTO pet_owner(pet_chip_id,owner,pet_name) VALUES (80d39c78-9dc0-11eb-a8b3-0242ac130003, 642adfee-6ad9-4ca5-aa32-a72e506b8ad8, 'Rocky');
INSERT INTO pet_owner(pet_chip_id,owner,pet_name) VALUES (92cf4f94-9dc0-11eb-a8b3-0242ac130003, b4a63c18-9dc0-11eb-a8b3-0242ac130003, 'Rin Tin Tin');
SELECT * FROM pet_owner;
SELECT * FROM pet_owner WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003;
SELECT * FROM pet_owner WHERE pet_name = 'Rocky'; -- (?) pet_name is not a key column, so this query is rejected unless ALLOW FILTERING is added
UPDATE pet_owner SET pet_name = 'Cat' WHERE pet_chip_id = 92cf4f94-9dc0-11eb-a8b3-0242ac130003;
DELETE FROM pet_owner WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003;
SELECT * FROM pet_owner;
Choosing a Partition Key
■ High Cardinality
■ Even Distribution
Avoid
■ Low Cardinality
■ Hot Partition
■ Large Partition
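A minimal Python sketch of why cardinality matters, under stated assumptions: `node_for` and `rows_per_node` are hypothetical helpers, and MD5 modulo the node count is only a stand-in for the database's real partitioner and token ring. It compares a high-cardinality key (user ID) with a two-valued key (team).

```python
import hashlib
from collections import Counter

def node_for(partition_key: str, num_nodes: int) -> int:
    """Map a partition key to a node by hashing (stand-in for the real partitioner)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

def rows_per_node(keys, num_nodes=3):
    """Count how many rows land on each node for the given partition keys."""
    counts = Counter(node_for(key, num_nodes) for key in keys)
    return [counts.get(node, 0) for node in range(num_nodes)]

# High cardinality: one partition per user ID, spread over all nodes.
by_user_id = rows_per_node(f"user-{i}" for i in range(9000))

# Low cardinality: only two possible values, so at most two nodes hold data.
by_team = rows_per_node(("angel" if i % 2 else "spike") for i in range(9000))

print("user_id:", by_user_id)
print("team:   ", by_team)
```

With the two-valued key, every row lands in one of two partitions, which is the hot and large partition case to avoid.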
https://www.codedrome.com/zipfs-law-in-python/
Choosing a Partition Key
■ User Name
■ User ID
■ User ID + Time
■ Sensor ID
■ Sensor ID + Time
■ Customer
■ State
■ Age
■ Favorite NBA Team
■ Team Angel or Team Spike
https://commons.wikimedia.org/
Query:
SELECT * from heartrate_v10 WHERE
pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003 LIMIT 1;
SELECT * from heartrate_v10 WHERE
pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003 AND
time >= '2021-05-01 01:00+0000' AND
time < '2021-05-01 01:03+0000';
https://gist.github.com/tzach/7486f1a0cc904c52f4514f20f14d2a97
Wide Partition Example
CREATE TABLE heartrate_v10 (
pet_chip_id uuid,
owner uuid,
time timestamp,
heart_rate int,
PRIMARY KEY (pet_chip_id, time)
);
pet_chip_id time heart_rate
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:00:00.000000+0000 120
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:01:00.000000+0000 121
80d39c78-9dc0-11eb-a8b3-0242ac130003 2021-05-01 01:02:00.000000+0000 120
Partition Key: pet_chip_id    Clustering Key: time
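A rough Python model of a wide partition, as a sketch only: the partition key selects one partition, and rows inside it stay sorted by the clustering key, so a time range becomes a binary search plus a slice. The `partitions` dict and the `insert`, `select_range`, and `ts` helpers are hypothetical names, not how the database stores data on disk.

```python
from bisect import bisect_left
from datetime import datetime, timezone

# partition key (pet_chip_id) -> rows kept sorted by clustering key (time)
partitions = {}

def insert(pet_chip_id, time, heart_rate):
    """Insert a row, keeping the partition sorted; same primary key overwrites."""
    rows = partitions.setdefault(pet_chip_id, [])
    idx = bisect_left(rows, (time,))
    if idx < len(rows) and rows[idx][0] == time:
        rows[idx] = (time, heart_rate)
    else:
        rows.insert(idx, (time, heart_rate))

def select_range(pet_chip_id, start, end):
    """Rows with start <= time < end, located by binary search in one partition."""
    rows = partitions.get(pet_chip_id, [])
    lo = bisect_left(rows, (start,))
    hi = bisect_left(rows, (end,))
    return rows[lo:hi]

def ts(minute):
    """Timestamp on 2021-05-01 at 01:<minute> UTC."""
    return datetime(2021, 5, 1, 1, minute, tzinfo=timezone.utc)

pet = "80d39c78-9dc0-11eb-a8b3-0242ac130003"
insert(pet, ts(2), 120)   # arrival order does not matter
insert(pet, ts(0), 120)
insert(pet, ts(1), 121)

heart_rates = [hr for _, hr in select_range(pet, ts(0), ts(3))]
print(heart_rates)
```

Every reading for one pet lands in the same partition, which is why the partition keeps growing as long as the pet keeps reporting.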
Wide Partition Example
Large Partition?
Choosing a Clustering Key
■ Allow useful range queries
■ Allow useful LIMIT
https://commons.wikimedia.org/
SELECT * from heartrate_v10 WHERE
pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003 LIMIT 1;
CREATE TABLE heartrate_v5 (
pet_chip_id uuid,
time timestamp,
heart_rate int,
PRIMARY KEY (pet_chip_id, time)
) WITH CLUSTERING ORDER BY (time DESC);
Partition Key: pet_chip_id    Clustering Key: time (DESC)
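A small Python sketch of what the clustering order changes for LIMIT 1, assuming a single partition held as a list of (time, heart_rate) tuples. `select_limit` is a hypothetical helper; the database keeps rows in clustering order on disk rather than sorting at read time.

```python
# Readings for one partition (one pet_chip_id), keyed by clustering key "time".
readings = [
    ("2021-05-01 01:00", 120),
    ("2021-05-01 01:01", 121),
    ("2021-05-01 01:02", 120),
]

def select_limit(rows, limit, descending=False):
    """Return the first `limit` rows in clustering order (ASC by default)."""
    ordered = sorted(rows, key=lambda row: row[0], reverse=descending)
    return ordered[:limit]

oldest = select_limit(readings, 1)                    # default ASC order
latest = select_limit(readings, 1, descending=True)   # CLUSTERING ORDER BY (time DESC)
print(oldest, latest)
```

With descending order, LIMIT 1 returns the most recent reading, which is usually the one a monitoring query wants.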
CREATE TABLE heartrate_v6 (
pet_chip_id uuid,
date text,
time timestamp,
heart_rate int,
PRIMARY KEY ((pet_chip_id, date), time));
Partition Key: (pet_chip_id, date)    Clustering Key: time
Partition Too Wide?
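A Python sketch of how adding date to the partition key bounds partition size. The 'YYYY-MM-DD' format for the `date` text column and the one-reading-per-minute rate are assumptions for illustration; `partition_key` is a hypothetical helper.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

def partition_key(pet_chip_id, time):
    """Composite partition key (pet_chip_id, date), with date as 'YYYY-MM-DD' text."""
    return (pet_chip_id, time.strftime("%Y-%m-%d"))

pet = "80d39c78-9dc0-11eb-a8b3-0242ac130003"
start = datetime(2021, 5, 1, tzinfo=timezone.utc)

# One reading per minute for three days (hypothetical rate).
times = [start + timedelta(minutes=m) for m in range(3 * 24 * 60)]

# PRIMARY KEY (pet_chip_id, time): one partition that grows without bound.
rows_by_pet = Counter(pet for _ in times)

# PRIMARY KEY ((pet_chip_id, date), time): one bounded partition per pet per day.
rows_by_pet_and_date = Counter(partition_key(pet, t) for t in times)

print(rows_by_pet[pet])
print(len(rows_by_pet_and_date), max(rows_by_pet_and_date.values()))
```

The trade-off is that a query now has to name the date as well, so reading a range that spans several days takes one query per day.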
Materialized Views
Example - Query by Owner
SELECT * FROM heartrate_v10 WHERE pet_chip_id = a2a60505-3e17-4ad4-8e1a-f11139caa1cc;
SELECT * FROM heartrate_v10 WHERE owner = 642adfee-6ad9-4ca5-aa32-a72e506b8ad8;
SELECT * FROM heartrate_v10 WHERE owner = 642adfee-6ad9-4ca5-aa32-a72e506b8ad8 ALLOW FILTERING;
https://gist.github.com/tzach/4b9dadbc6e8a9c50369da05631c5e13e
Try
TRACING ON;
TRACING OFF;
Solution - Materialized Views
CREATE TABLE heartrate_v10 (
pet_chip_id uuid, owner uuid, time timestamp, heart_rate int,
PRIMARY KEY (pet_chip_id, time)
);
SELECT * FROM heartrate_by_owner WHERE owner = 642adfee-6ad9-4ca5-aa32-a72e506b8ad8;
CREATE MATERIALIZED VIEW heartrate_by_owner AS
SELECT * FROM heartrate_v10
WHERE owner IS NOT NULL AND pet_chip_id IS NOT NULL AND time IS NOT NULL
PRIMARY KEY(owner, pet_chip_id, time);
DROP MATERIALIZED VIEW heartrate_by_owner;
ALTER MATERIALIZED VIEW heartrate_by_owner [WITH table_options];
https://docs.scylladb.com/getting-started/mv/
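A toy Python model of the idea behind a materialized view, as a sketch only: every write to the base table also updates a second structure keyed by owner, so a query by owner becomes a direct lookup instead of a scan. The `base` and `view` dicts and the helper functions are hypothetical, and real view updates are replicated and asynchronous.

```python
base = {}   # (pet_chip_id, time) -> (owner, heart_rate)
view = {}   # owner -> {(pet_chip_id, time): heart_rate}

def insert(pet_chip_id, owner, time, heart_rate):
    """Write to the base table and keep the view keyed by owner in sync."""
    key = (pet_chip_id, time)
    old = base.get(key)
    if old is not None and old[0] is not None and old[0] != owner:
        view[old[0]].pop(key, None)       # owner changed: drop the stale view row
    base[key] = (owner, heart_rate)
    if owner is not None:                 # mirrors "WHERE owner IS NOT NULL"
        view.setdefault(owner, {})[key] = heart_rate

def select_by_owner_scan(owner):
    """Like ALLOW FILTERING: inspects every row of the base table."""
    return sorted((pet, t, hr) for (pet, t), (o, hr) in base.items() if o == owner)

def select_by_owner_view(owner):
    """Like querying heartrate_by_owner: one lookup by the view's partition key."""
    return sorted((pet, t, hr) for (pet, t), hr in view.get(owner, {}).items())

owner_a = "642adfee-6ad9-4ca5-aa32-a72e506b8ad8"
owner_b = "b4a63c18-9dc0-11eb-a8b3-0242ac130003"
insert("80d39c78-9dc0-11eb-a8b3-0242ac130003", owner_a, "2021-05-01 01:00", 120)
insert("80d39c78-9dc0-11eb-a8b3-0242ac130003", owner_a, "2021-05-01 01:01", 121)
insert("92cf4f94-9dc0-11eb-a8b3-0242ac130003", owner_b, "2021-05-01 01:00", 95)

print(select_by_owner_view(owner_a))
```

Both functions return the same rows; the view pays for the faster read with an extra write on every insert.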
Example
Base Table
pet_chip_id                           time                             owner                                 heart_rate
80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:00:00.000000+0000  642adfee-6ad9-4ca5-aa32-a72e506b8ad8  120
80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:01:00.000000+0000  642adfee-6ad9-4ca5-aa32-a72e506b8ad8  121
80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:02:00.000000+0000  642adfee-6ad9-4ca5-aa32-a72e506b8ad8  120
View
owner                                 pet_chip_id                           time                             heart_rate
642adfee-6ad9-4ca5-aa32-a72e506b8ad8  80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:00:00.000000+0000  120
642adfee-6ad9-4ca5-aa32-a72e506b8ad8  80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:01:00.000000+0000  121
642adfee-6ad9-4ca5-aa32-a72e506b8ad8  80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:02:00.000000+0000  120
MV - Write Path
Nodes: Coordinator, Base replica, View replica
1. Client ➔ Coordinator: INSERT INTO heartrate (pet_chip_id, owner, time, heart_rate) VALUES (..);
2. Coordinator ➔ Base replica: INSERT INTO heartrate
3. Base replica ➔ View replica: INSERT INTO heartrate_by_owner
MV - Read Path
Nodes: Coordinator, Base replica, View replica
1. Client ➔ Coordinator: SELECT * FROM heartrate_by_owner WHERE owner = '642a..';
2. Coordinator ➔ View replica: SELECT * FROM heartrate_by_owner WHERE owner = '642a..';
http://localhost:3000/d/overview-2019-1/overview
Keep in touch!
Tzach Livyatan
ScyllaDB
tzach@scylladb.com
@tzachL

[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Recently uploaded (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

NoSQL Data Modeling 101

  • 1. NoSQL Data Modeling 101 Tzach Livyatan
  • 2. Content ■ Basic Data Modeling ● CQL ● Partition Key ● Clustering Key ■ Materialized Views 2
  • 3. 3 NoSQL Vs. Relational Application Data Model (Schema) Model (Schema) Application Data Relational NoSQL
  • 4. ➔ Cluster ◆ Keyspace ● Table ● Partition ● Row ○ Column - name / value pair 4 Data Modeling Terminology
  • 5. What is CQL ■ Cassandra Query Language ■ Similar to SQL (Structured Query Language) ■ Data Definition (DDL) ● CREATE / DROP / ALTER Keyspace ● CREATE / DROP / ALTER Table ■ Data Manipulation (DML) ● SELECT ● INSERT ● UPDATE ● DELETE ● BATCH
  • 6. Keyspace A top-level object that controls the replication per DC. Contains tables, indexes, materialized views and user-defined types. CREATE KEYSPACE Excalibur WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 1, 'DC2' : 3} AND durable_writes = true;
  • 7. Keyspace Example CREATE KEYSPACE mykeyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'AWS_US_EAST_1' : 3} AND durable_writes = true; USE mykeyspace; 7
  • 8. Common Data Types ■ ASCII ■ BIGINT ■ BLOB ■ BOOLEAN ■ COUNTER ■ DATE ■ DECIMAL ■ DOUBLE ■ DURATION ■ FLOAT ■ INET 8 ■ INT ■ SMALLINT ■ TEXT ■ TIME ■ TIMESTAMP ■ TIMEUUID ■ TINYINT ■ UUID ■ VARCHAR ■ VARINT * https://docs.scylladb.com/getting-started/types/
  • 11. Key / Value Example
    SELECT pet_chip_id,owner,pet_name FROM pet_owner;
    pet_chip_id                           owner              pet_name
    80d39c78-9dc0-11eb-a8b3-0242ac130003  642adfee-6ad9-...  Buddy
    80d39c78-9dc0-11eb-a8b3-0242ac130003  642adfee-6ad9-...  Rocky
    80d39c78-9dc0-11eb-a8b3-0242ac130003  642adfee-6ad9-...  Cat
    ...                                   ...                ...
  • 12. Key / Value Example
    CREATE TABLE IF NOT EXISTS pet_owner ( pet_chip_id uuid, owner uuid, pet_name text, PRIMARY KEY (pet_chip_id) );
    Partition Key: pet_chip_id
    pet_chip_id                           owner              pet_name
    80d39c78-9dc0-11eb-a8b3-0242ac130003  642adfee-6ad9-...  Buddy
    80d39c78-9dc0-11eb-a8b3-0242ac130003  642adfee-6ad9-...  Rocky
    80d39c78-9dc0-11eb-a8b3-0242ac130003  642adfee-6ad9-...  Cat
    ...                                   ...                ...
  • 13. Key / Value Example
    INSERT INTO pet_owner(pet_chip_id,owner,pet_name) VALUES (a2a60505-3e17-4ad4-8e1a-f11139caa1cc, 642adfee-6ad9-4ca5-aa32-a72e506b8ad8, 'Buddy');
    INSERT INTO pet_owner(pet_chip_id,owner,pet_name) VALUES (80d39c78-9dc0-11eb-a8b3-0242ac130003, 642adfee-6ad9-4ca5-aa32-a72e506b8ad8, 'Rocky');
    INSERT INTO pet_owner(pet_chip_id,owner,pet_name) VALUES (92cf4f94-9dc0-11eb-a8b3-0242ac130003, b4a63c18-9dc0-11eb-a8b3-0242ac130003, 'Rin Tin Tin');
    SELECT * FROM pet_owner;
    SELECT * FROM pet_owner WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003;
    SELECT * FROM pet_owner WHERE pet_name = 'Rocky'; (?)
  • 14. Key / Value Example
    UPDATE pet_owner SET pet_name = 'Cat' WHERE pet_chip_id = 92cf4f94-9dc0-11eb-a8b3-0242ac130003;
    DELETE FROM pet_owner WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003;
    SELECT * FROM pet_owner;
  • 15. Key / Value Example 15
  • 16. Choosing a Partition Key ■ High Cardinality ■ Even Distribution Avoid ■ Low Cardinality ■ Hot Partition ■ Large Partition 16 https://www.codedrome.com/zipfs-law-in-python/
  • 17. Choosing a Partition Key 17 ■ User Name ■ User ID ■ User ID + Time ■ Sensor ID ■ Sensor ID + Time ■ Customer ■ State ■ Age ■ Favorite NBA Team ■ Team Angel or Team Spike https://commons.wikimedia.org/
  • 18. Wide Partition Example
    Query:
    SELECT * from heartrate_v10 WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003 LIMIT 1;
    SELECT * from heartrate_v10 WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003 AND time >= '2021-05-01 01:00+0000' AND time < '2021-05-01 01:03+0000';
    https://gist.github.com/tzach/7486f1a0cc904c52f4514f20f14d2a97
  • 19. Wide Partition Example
    CREATE TABLE heartrate_v10 ( pet_chip_id uuid, owner uuid, time timestamp, heart_rate int, PRIMARY KEY (pet_chip_id, time) );
    Partition Key: pet_chip_id. Clustering Key: time.
    pet_chip_id                           time                             heart_rate
    80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:00:00.000000+0000  120
    80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:01:00.000000+0000  121
    80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:02:00.000000+0000  120
  • 21. Choosing a Clustering Key 21 ■ Allow useful range queries ■ Allow useful LIMIT https://commons.wikimedia.org/
  • 22. SELECT * from heartrate_v10 WHERE pet_chip_id = 80d39c78-9dc0-11eb-a8b3-0242ac130003 LIMIT 1;
    CREATE TABLE heartrate_v5 ( pet_chip_id uuid, time timestamp, heart_rate int, PRIMARY KEY (pet_chip_id, time) ) WITH CLUSTERING ORDER BY (time DESC);
    Partition Key: pet_chip_id. Clustering Key: time.
  • 23. Too Wide Partition?
    CREATE TABLE heartrate_v6 ( pet_chip_id uuid, date text, time timestamp, heart_rate int, PRIMARY KEY ((pet_chip_id, date), time));
    Partition Key: (pet_chip_id, date). Clustering Key: time.
  • 25. Example - Query by Owner
    SELECT * FROM heartrate_v10 WHERE pet_chip_id = a2a60505-3e17-4ad4-8e1a-f11139caa1cc;
    SELECT * FROM heartrate_v10 WHERE owner = 642adfee-6ad9-4ca5-aa32-a72e506b8ad8;
    SELECT * FROM heartrate_v10 WHERE owner = 642adfee-6ad9-4ca5-aa32-a72e506b8ad8 ALLOW FILTERING;
    Try TRACING ON; TRACING OFF;
    https://gist.github.com/tzach/4b9dadbc6e8a9c50369da05631c5e13e
  • 26. Solution - Materialized Views
    CREATE TABLE heartrate_v10 ( pet_chip_id uuid, owner uuid, time timestamp, heart_rate int, PRIMARY KEY (pet_chip_id, time) );
    SELECT * FROM heartrate_by_owner WHERE owner = 642adfee-6ad9-4ca5-aa32-a72e506b8ad8;
    CREATE MATERIALIZED VIEW heartrate_by_owner AS SELECT * FROM heartrate_v10 WHERE owner IS NOT NULL AND pet_chip_id IS NOT NULL AND time IS NOT NULL PRIMARY KEY(owner, pet_chip_id, time);
    DROP MATERIALIZED VIEW heartrate_by_owner;
    ALTER MATERIALIZED VIEW heartrate_by_owner [WITH table_options];
    https://docs.scylladb.com/getting-started/mv/
  • 27. Example
    Base Table
    pet_chip_id                           time                             owner                                 heart_rate
    80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:00:00.000000+0000  642adfee-6ad9-4ca5-aa32-a72e506b8ad8  120
    80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:01:00.000000+0000  642adfee-6ad9-4ca5-aa32-a72e506b8ad8  121
    80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:02:00.000000+0000  642adfee-6ad9-4ca5-aa32-a72e506b8ad8  120
    View
    owner                                 pet_chip_id                           time                             heart_rate
    642adfee-6ad9-4ca5-aa32-a72e506b8ad8  80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:00:00.000000+0000  120
    642adfee-6ad9-4ca5-aa32-a72e506b8ad8  80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:01:00.000000+0000  121
    642adfee-6ad9-4ca5-aa32-a72e506b8ad8  80d39c78-9dc0-11eb-a8b3-0242ac130003  2021-05-01 01:02:00.000000+0000  120
  • 28. 28
  • 29. 1. INSERT INTO heartrate (pet_chip_id, Owner, Time, heart_rate) VALUES (..); 2. INSERT INTO heartrate Base replica View replica Coordinator 3. INSERT INTO heartrate_by_owner MV - Write Path 29
  • 30. MV - Read Path 2. SELECT * FROM heartrate_by_owner WHERE owner = ‘642a..’; Base replica View replica Coordinator 1. SELECT * FROM heartrate_by_owner WHERE owner = ‘642a..’; 30
  • 32. Keep in touch! Tzach Livyatan ScyllaDB tzach@scylladb.com @tzachL
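
The partition key guidance on slides 16 and 17 (high cardinality, even distribution) can be illustrated with a toy hash ring. This is a minimal sketch in plain Python, not Scylla's implementation: the `Ring` class, the `token` helper, the node names, and the use of SHA-256 as the hash are all assumptions made for illustration. It shows only the general idea that a partition key is hashed to a token and the token decides which nodes own the row.

```python
import bisect
import hashlib
import uuid
from collections import Counter


def token(partition_key: str) -> int:
    """Hash a partition key to a 64-bit token (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: every node owns several token ranges."""

    def __init__(self, nodes, ranges_per_node=64):
        self.nodes = list(nodes)
        # Place each node at several points on the ring, sorted by token.
        self._points = sorted(
            (token(f"{node}#{i}"), node)
            for node in self.nodes
            for i in range(ranges_per_node)
        )
        self._tokens = [t for t, _ in self._points]

    def replicas(self, partition_key, rf=1):
        """Walk the ring clockwise and collect rf distinct nodes."""
        rf = min(rf, len(set(self.nodes)))
        i = bisect.bisect_right(self._tokens, token(partition_key))
        owners = []
        while len(owners) < rf:
            node = self._points[i % len(self._points)][1]
            if node not in owners:
                owners.append(node)
            i += 1
        return owners


ring = Ring(["node-a", "node-b", "node-c"])  # hypothetical node names

# High cardinality: one partition per pet chip ID.
pet_ids = [str(uuid.uuid5(uuid.NAMESPACE_DNS, f"pet-{n}")) for n in range(1000)]
by_pet = Counter(ring.replicas(pet_id)[0] for pet_id in pet_ids)

# Low cardinality: only two possible partition key values.
teams = ["Team Angel", "Team Spike"]
by_team = Counter(ring.replicas(teams[n % 2])[0] for n in range(1000))

print("rows per node, keyed by pet_chip_id:", dict(by_pet))
print("rows per node, keyed by team:", dict(by_team))
```

With only two distinct key values, at most two nodes can ever hold data, however many rows are written; that is the hot and large partition problem the slide warns about.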

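The clustering key slides (19 to 23) describe rows that are kept sorted inside a partition, which is what makes range queries and LIMIT cheap. The sketch below models that behavior in plain Python under stated assumptions: `WidePartitionTable` and `bucket_key` are hypothetical names, rows are always stored in ascending order and reversed on read for the DESC case, and nothing here reflects how Scylla stores data on disk.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timezone


class WidePartitionTable:
    """Toy table: rows grouped by partition key, sorted by clustering key."""

    def __init__(self, descending=False):
        self.descending = descending  # models CLUSTERING ORDER BY (time DESC)
        self._partitions = defaultdict(list)  # partition key -> [(time, heart_rate)]

    def insert(self, partition_key, time, heart_rate):
        rows = self._partitions[partition_key]
        times = [t for t, _ in rows]
        i = bisect.bisect_left(times, time)
        if i < len(rows) and rows[i][0] == time:
            rows[i] = (time, heart_rate)  # same primary key: overwrite
        else:
            rows.insert(i, (time, heart_rate))  # keep the partition sorted

    def select(self, partition_key, start=None, end=None, limit=None):
        """Rows with start <= time < end, in clustering order, up to limit."""
        rows = self._partitions.get(partition_key, [])
        times = [t for t, _ in rows]
        lo = 0 if start is None else bisect.bisect_left(times, start)
        hi = len(rows) if end is None else bisect.bisect_left(times, end)
        result = rows[lo:hi]
        if self.descending:
            result = result[::-1]
        return result if limit is None else result[:limit]


def bucket_key(pet_chip_id, time):
    """Composite partition key (pet_chip_id, date), as in heartrate_v6."""
    return (pet_chip_id, time.date().isoformat())


def ts(minute):
    return datetime(2021, 5, 1, 1, minute, tzinfo=timezone.utc)


pet = "80d39c78-9dc0-11eb-a8b3-0242ac130003"
readings = [(1, 121), (0, 120), (2, 120)]  # inserted out of order on purpose

asc = WidePartitionTable()
desc = WidePartitionTable(descending=True)
bucketed = WidePartitionTable(descending=True)
for minute, heart_rate in readings:
    asc.insert(pet, ts(minute), heart_rate)
    desc.insert(pet, ts(minute), heart_rate)
    bucketed.insert(bucket_key(pet, ts(minute)), ts(minute), heart_rate)

oldest = asc.select(pet, limit=1)    # LIMIT 1 with the default ASC order
latest = desc.select(pet, limit=1)   # LIMIT 1 with DESC order
window = asc.select(pet, start=ts(0), end=ts(3))  # range query on time
one_day = bucketed.select((pet, "2021-05-01"), limit=1)
```

Bucketing by date caps how large a single partition can grow, at the cost that a query must now name the day it wants.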
Editor's Notes

  1. Tzach - VP of Product. The session is available as a course in Scylla U.
  2. Let’s go over some important terms:
     A Cluster is a collection of nodes that Scylla uses to store the data. The nodes are logically arranged in a ring, and a minimal cluster typically has at least three nodes. Data is automatically replicated across the cluster according to the Replication Factor. This is often called a ring architecture because it is based on a hash ring, which is how the cluster knows how to distribute data across the nodes.
     A Keyspace is a top-level container that stores tables, with attributes that define how data is replicated on nodes. It defines a number of options that apply to all the tables it contains, the most important of which is the replication strategy. A keyspace is comparable to a database Schema in the relational world. Since the keyspace defines the replication factor of all its tables, tables that require different replication factors go in different keyspaces.
     A Table is how Scylla stores data and can be thought of as a set of rows and columns.
     A Partition is a collection of sorted rows, identified by a unique partition key. More on keys later in this session. Each partition is stored on a node and replicated across nodes.
     A Row in Scylla is a unit that stores data. Each row has a primary key that uniquely identifies it in a Table, and stores data as pairs of column names and values. If a Clustering Key is defined, the rows in the partition are sorted by it. More on that later.
  3. CQL is the query language used to interface with Scylla. It allows us to perform basic functions such as insert, update, select, delete, create, and so on. CQL is in some ways similar to SQL, but there are some differences.
  4. replication: the replication strategy and options to use for the keyspace (see details below). durable_writes: whether to use the commit log for updates on this keyspace (disable this option at your own risk!).
  5. Share a terminal > ty-share
  6. Before we create a table, we need to know about: data types, keys, and tables.
  7. Collections describe a group of items connected to a single key, which helps simplify data modeling. Remember to use the appropriate collection for each use case, and keep collections small to prevent high latency when querying the data. Sets are ordered alphabetically or by the natural sort order of the type; examples: multiple email addresses or phone numbers per user. Lists are ordered as defined by the user. Maps hold pairs of typed keys and values, which is very helpful for logging sequential events. Summary: collections help users organize their data, but should be used only where they fit, because of their performance impact.
  8. A Partition Key is one or more columns that are responsible for data distribution across the nodes. It determines on which node a given row is stored. Every table must have a Partition Key. In the example below the Partition Key is the ID column. A consistent hash function, also known as the partitioner, determines to which nodes data is written. Scylla transparently partitions the data, distributes it across the cluster, and replicates it. A Scylla cluster is visualized as a ring, where each node is responsible for a range of tokens and each partition key value is mapped to a token.
  9. Allows fast queries by pet, and only by pet! PRIMARY KEY = Partition Key + Clustering Key
  10. https://gist.github.com/tzach/7486f1a0cc904c52f4514f20f14d2a97
  11. Why is a large partition a problem? Is it a problem? A large partition may lead to a hot partition. Index implementation (no longer an issue in Scylla).
  12. By default, sorting is based on the natural (ASC) order of the clustering columns. What happens if we want to reverse the order? What if our query finds the heart rate by pet_chip_id and time, but we want to look at the ten most recent records?
  13. By default, sorting is based on the natural (ASC) order of the clustering columns. What happens if we want to reverse the order? What if our query finds the heart rate by pet_chip_id and time, but we want to look at the ten most recent records?
  14. Now that we have seen that we can query each individual pet, what about their owners? Let’s try. Scylla outputs an error message saying that the query might hurt performance; if you want to run it anyway, you should use ALLOW FILTERING. With ALLOW FILTERING, the query works. Scylla raises the error because we are querying a regular column that is not indexed, which hurts performance: Scylla has to do a full scan of the table, reading every partition and filtering the rows afterwards. Use TRACING to see how much performance is affected. One way to solve this problem is to create one table keyed by owner ID and another keyed by pet ID, and have the application do dual writes. The problem is that we then need to make sure both tables stay synchronized.
  15. MV - a new table that is updated automatically from the base table. Show syntax.
  16. But if we create a view with owner as the partition key, we can then query the view by its partition key (owner).
  17. Every time an insert is received by the Coordinator, Scylla inserts into the base table and applies the relevant updates to the MV replicas. Views are kept in sync with the base table and work like any other table, except that Scylla rejects writes made directly to a materialized view. No magic: it is a tradeoff between read latency and disk space. Every MV you create needs additional disk space.
  18. When querying the MV specifically, Scylla reads from the MV only: low latency.
  19. See Shlomi's session after mine, on the second track.
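
Notes 14 to 18 contrast a filtered scan on `owner` with a materialized view keyed by `owner`. The difference can be sketched in plain Python as follows. This is an illustrative model only: `HeartrateWithView` and its methods are hypothetical names, the second owner's heart rates are made-up sample values, and replicas, the coordinator, and consistency are left out entirely. It shows only that every write to the base table also updates the view, and that a read from the view touches just the requested owner's rows.

```python
class HeartrateWithView:
    """Toy base table plus a view repartitioned by owner."""

    def __init__(self):
        self.base = {}  # (pet_chip_id, time) -> {"owner": ..., "heart_rate": ...}
        self.view = {}  # owner -> {(pet_chip_id, time): heart_rate}

    def insert(self, pet_chip_id, time, owner, heart_rate):
        key = (pet_chip_id, time)
        old = self.base.get(key)
        if old is not None and old["owner"] != owner:
            # The owner changed, so the stale view row must be removed.
            self.view.get(old["owner"], {}).pop(key, None)
        self.base[key] = {"owner": owner, "heart_rate": heart_rate}
        if owner is not None:  # mirrors WHERE owner IS NOT NULL
            self.view.setdefault(owner, {})[key] = heart_rate

    def by_owner_filtering(self, owner):
        """Like ALLOW FILTERING: read every base row, keep the matches."""
        scanned = 0
        rows = []
        for (pet_chip_id, time), row in self.base.items():
            scanned += 1
            if row["owner"] == owner:
                rows.append((pet_chip_id, time, row["heart_rate"]))
        return sorted(rows), scanned

    def by_owner_view(self, owner):
        """Like querying the view: read only the owner's partition."""
        partition = self.view.get(owner, {})
        rows = [(pet, time, hr) for (pet, time), hr in partition.items()]
        return sorted(rows), len(rows)


owner_a = "642adfee-6ad9-4ca5-aa32-a72e506b8ad8"
owner_b = "b4a63c18-9dc0-11eb-a8b3-0242ac130003"
pet_a = "80d39c78-9dc0-11eb-a8b3-0242ac130003"
pet_b = "92cf4f94-9dc0-11eb-a8b3-0242ac130003"

table = HeartrateWithView()
table.insert(pet_a, "2021-05-01 01:00", owner_a, 120)
table.insert(pet_a, "2021-05-01 01:01", owner_a, 121)
table.insert(pet_a, "2021-05-01 01:02", owner_a, 120)
table.insert(pet_b, "2021-05-01 01:00", owner_b, 95)  # made-up sample value
table.insert(pet_b, "2021-05-01 01:01", owner_b, 97)  # made-up sample value

filtered_rows, scanned_full = table.by_owner_filtering(owner_a)
view_rows, scanned_view = table.by_owner_view(owner_a)
view_row_count = sum(len(p) for p in table.view.values())

print("rows read with filtering:", scanned_full)
print("rows read from the view:", scanned_view)
```

Both reads return the same rows, but the filtered read touches every row in the base table while the view read touches only the owner's rows. The price is that the view stores a second copy of the data, which is the read latency versus disk space tradeoff from note 17.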