SlideShare uma empresa Scribd logo
1 de 58
Baixar para ler offline
Five data models for sharding
Craig Kerstiens, Head of Cloud at Citus
Who am I?
Craig Kerstiens
http://www.craigkerstiens.com
@craigkerstiens
Postgres weekly
Run Citus Cloud
Co-chair PostgresOpen
What is sharding
Practice of separating a large database into smaller, faster, more easily
managed parts called data shards.
Source: the internet
LogicalPhysical
What is a shard?
A table
A schema
A PG database
A node
An instance
A PG cluster
Is it a shard?
NO YES
2 Nodes - 32 shards
4 nodes - still 32 shards
Five models
• Geography
• Multi-tenant
• Entity id
• Graph model
• Time series
But first, two approaches
RangeHash
Hash - The steps
1. Hash your id
2. Define a shard range x shards, and each contain some range of hash
values. Route all inserts/updates/deletes to the shard
3. Profit
More details
• Hash based on some id
• Postgres internal hash can work fine, or so can your own
• Define your number of shards up front, make this larger than you expect
to grow to in terms of nodes
• (2 is bad)
• (2 million is also bad)
• Factors of 2 are nice, but not actually required
Don’t just route values
• 1-10 -> shard 1
• 2-20 -> shard 2
Create range of hash values
• hash 1 = 46154
• hash 2 = 27193
• Shard 13 = ranges 26624 to 28672
Range - The steps
1. Ensure you’ve created your new destination for your range
2. Route your range to the right bucket
3. Profit
@citusdatawww.citusdata.com
Thanks
Craig Kerstiens
@craigkerstiens
Five models
• Geography
• Multi-tenant
• Entity id
• Graph model
• Time series
Click to edit master tile stylegeography
Shard by Geography
• Is there a clear line I can draw for a geographical boundary
• Good examples: income by state, healthcare, etc.
• Bad examples:
• Text messages: 256 sends to 510, both want a copy of this data…
Will geography sharding work for you?
• Do you join across geographies?
• Does data easily cross boundaries?
• Is data queries across boundaries or a different access frequently?
More specifics
• Granular vs. broad
• State vs. zip code
• (California and texas are bad)
• Zip codes might work, but does that work for your app?
Common use cases
• If your go to market is geography focused
• Instacart/Shipt
• Uber/Lyft
Real world application
• Range sharding makes moving things around harder here
• Combining the geography and giving each and id, then hashing (but using
smaller set of shards) can give better balance to your data
multi-tenant
Sharding by tenant
• Is each customer’s data their own?
• What’s your data’s distribution?
• (If one tenant/customer is 50% of your data tenant sharding won’t help)
• If it’s 10% of your data you may be okay
Common use cases
• Saas/B2B
• Salesforce
• Marketing automation
• Any online SaaS
Guidelines for multi-tenant sharding
• Put your tenant_id on every table that’s relevant
• Yes, denormalize
• Ensure primary keys and foreign keys are composite ones (with
tenant_id)
• Enforce your tenant_id is on all queries so things are appropriately
scoped
Salesforce schema
CREATE TABLE leads (
id serial primary key,
first_name text,
last_name text,
email text
);
CREATE TABLE accounts (
id serial primary key,
name text,
state varchar(2),
size int
);
CREATE TABLE opportunity (
id serial primary key,
name text,
amount int
);
Salesforce schema - with orgs
CREATE TABLE leads (
id serial primary key,
first_name text,
last_name text,
email text,
org_id int
);
CREATE TABLE accounts (
id serial primary key,
name text,
state varchar(2),
size int
org_id int
);
CREATE TABLE opportunity (
id serial primary key,
name text,
amount int
org_id int
);
Salesforce schema - with orgs
CREATE TABLE leads (
id serial primary key,
first_name text,
last_name text,
email text,
org_id int
);
CREATE TABLE accounts (
id serial primary key,
name text,
state varchar(2),
size int
org_id int
);
CREATE TABLE opportunity (
id serial primary key,
name text,
amount int
org_id int
);
Salesforce schema - with keys
CREATE TABLE leads (
id serial,
first_name text,
last_name text,
email text,
org_id int,
primary key (org_id, id)
);
CREATE TABLE accounts (
id serial,
name text,
state varchar(2),
size int,
org_id int,
primary key (org_id, id)
);
CREATE TABLE opportunity (
id serial,
name text,
amount int,
Salesforce schema - with keys
CREATE TABLE leads (
id serial,
first_name text,
last_name text,
email text,
org_id int,
primary key (org_id, id)
);
CREATE TABLE accounts (
id serial,
name text,
state varchar(2),
size int,
org_id int,
primary key (org_id, id)
);
CREATE TABLE opportunity (
id serial,
name text,
amount int,
Warnings about multi-tenant implementations
• Danger ahead if using schemas on older PG versions
• Have to reinvent the wheel for even the basics
• Schema migrations
• Connection limits
• Think twice before using a schema or database per tenant
Click to edit master tile styleentity_id
Entity id
• What’s an entity id?
• Something granular
• Want to join where you can though
• Optimizing for parallelism and less for data in memory
Examples tell it best
• Web analytics
• Shard by visitor_id
• Shard both sessions and views
• Key is to co-locate things you’ll join on
Key considerations
• SQL will be more limited OR slow
• Think in terms of map reduce
Map reduce examples
• Count (*)
• SUM of 32 smaller count (*)
• Average
• SUM of 32 smaller SUM(foo) / SUM of 32 smaller count(*)
• Median
• uh….
But I like medians and more
• Count distinct
• HyperLogLog
• Ordered list approximation
• Top-n
• Median
• T-digest or HDR
Graph modelgraph model
When you use a graph database
• You’ll know, really you will
Very different approach
Craig
posted
photo
Daniel
liked
Will
posted
comment
But what about sharding?
• Within a graph model you’re going to duplicate your data
• Shard based on both:
• The objects themselves
• The objects subscribed to other objects
Read this
https://www.usenix.org/system/files/conference/atc13/atc13-bronson.pdf
time series
Time series: It’s obvious right?
• Well it depends
Querying long ranges
Not removing data
Always querying time
Querying a subset
Remove old data
Good Okay/Bad
Time series
• Range partitioning
• 2016 in a bucket, 2017 in a bucket
• 2016-01-01 in a bucket, 2016-01-02 in a bucket…
• Key steps
• Determine your ranges
• Make sure you setup enough in advance, or automate creating new ones
• Delete
Sensor data
CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
);
Sensor data - initial partition
CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
Sensor data - initial partition
CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
Sensor data - setting up partitions
CREATE TABLE measurement_y2017m10 PARTITION OF measurement
FOR VALUES FROM ('2017-10-01') TO ('2017-10-31');
CREATE TABLE measurement_y2017m11 PARTITION OF measurement
FOR VALUES FROM ('2017-11-01') TO ('2017-11-30');
Sensor data - indexing
CREATE TABLE measurement_y2017m10 PARTITION OF measurement
FOR VALUES FROM ('2017-10-01') TO ('2017-10-31');
CREATE TABLE measurement_y2017m11 PARTITION OF measurement
FOR VALUES FROM ('2017-11-01') TO ('2017-11-30');
CREATE INDEX ON measurement_y2017m10 (logdate);
CREATE INDEX ON measurement_y2017m11 (logdate);
Sensor data - inserting
CREATE TRIGGER insert_measurement_trigger
BEFORE INSERT ON measurement
FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();
Sensor data - inserting
CREATE OR REPLACE FUNCTION measurement_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
IF ( NEW.logdate >= DATE '2017-02-01' AND
NEW.logdate < DATE '2017-03-01' ) THEN
INSERT INTO measurement_y2017m02 VALUES (NEW.*);
ELSIF ( NEW.logdate >= DATE '2017-03-01' AND
NEW.logdate < DATE '2017-04-01' ) THEN
INSERT INTO measurement_y2017m03 VALUES (NEW.*);
...
ELSIF ( NEW.logdate >= DATE '2018-01-01' AND
NEW.logdate < DATE '2018-02-01' ) THEN
INSERT INTO measurement_y2018m01 VALUES (NEW.*);
ELSE
RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!';
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
Five models
• Geography
• Multi-tenant
• Entity id
• Graph model
• Time series
Recap
• Not sharding is always easier than sharding
• Identify your sharding approach/key early, denormalize it even when
you’re small
• Don’t force it into one model. No model is perfect, but disqualify where
you can
• Sharding used to be much more painful, it’s not quite a party yet, but it’s
@citusdatawww.citusdata.com
Thanks
Craig Kerstiens
@craigkerstiens
https://2018.nordicpgday.org/feedback

Mais conteúdo relacionado

Mais procurados

The Challenges of Distributing Postgres: A Citus Story
The Challenges of Distributing Postgres: A Citus StoryThe Challenges of Distributing Postgres: A Citus Story
The Challenges of Distributing Postgres: A Citus Story
Hanna Kelman
 
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Databricks
 
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansRealtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Spark Summit
 
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
Databricks
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's Roadmap
Itai Yaffe
 

Mais procurados (20)

The Challenges of Distributing Postgres: A Citus Story
The Challenges of Distributing Postgres: A Citus StoryThe Challenges of Distributing Postgres: A Citus Story
The Challenges of Distributing Postgres: A Citus Story
 
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...
 
Spark Summit - Stratio Streaming
Spark Summit - Stratio Streaming Spark Summit - Stratio Streaming
Spark Summit - Stratio Streaming
 
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
 
Time series database, InfluxDB & PHP
Time series database, InfluxDB & PHPTime series database, InfluxDB & PHP
Time series database, InfluxDB & PHP
 
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansRealtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
 
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengChallenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
 
Data Analytics with Druid
Data Analytics with DruidData Analytics with Druid
Data Analytics with Druid
 
[Spark meetup] Spark Streaming Overview
[Spark meetup] Spark Streaming Overview[Spark meetup] Spark Streaming Overview
[Spark meetup] Spark Streaming Overview
 
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
 
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
 
RUCK 2017 R에 날개 달기 - Microsoft R과 클라우드 머신러닝 소개
RUCK 2017 R에 날개 달기 - Microsoft R과 클라우드 머신러닝 소개RUCK 2017 R에 날개 달기 - Microsoft R과 클라우드 머신러닝 소개
RUCK 2017 R에 날개 달기 - Microsoft R과 클라우드 머신러닝 소개
 
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
OAP: Optimized Analytics Package for Spark Platform with Daoyuan Wang and Yua...
 
Real-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and DruidReal-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and Druid
 
ClickHouse Analytical DBMS: Introduction and Case Studies, by Alexander Zaitsev
ClickHouse Analytical DBMS: Introduction and Case Studies, by Alexander ZaitsevClickHouse Analytical DBMS: Introduction and Case Studies, by Alexander Zaitsev
ClickHouse Analytical DBMS: Introduction and Case Studies, by Alexander Zaitsev
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Scaling Data Analytics Workloads on Databricks
Scaling Data Analytics Workloads on DatabricksScaling Data Analytics Workloads on Databricks
Scaling Data Analytics Workloads on Databricks
 
Omid: A Transactional Framework for HBase
Omid: A Transactional Framework for HBaseOmid: A Transactional Framework for HBase
Omid: A Transactional Framework for HBase
 
Fast Data with Apache Ignite and Apache Spark with Christos Erotocritou
Fast Data with Apache Ignite and Apache Spark with Christos ErotocritouFast Data with Apache Ignite and Apache Spark with Christos Erotocritou
Fast Data with Apache Ignite and Apache Spark with Christos Erotocritou
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's Roadmap
 

Semelhante a Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig Kerstiens

Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
George Stathis
 
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Citus Data
 
High-Performance Advanced Analytics with Spark-Alchemy
High-Performance Advanced Analytics with Spark-AlchemyHigh-Performance Advanced Analytics with Spark-Alchemy
High-Performance Advanced Analytics with Spark-Alchemy
Databricks
 
Dimensional Modelling Session 2
Dimensional Modelling Session 2Dimensional Modelling Session 2
Dimensional Modelling Session 2
akitda
 

Semelhante a Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig Kerstiens (20)

Postgres Vision 2018: Five Sharding Data Models
Postgres Vision 2018: Five Sharding Data ModelsPostgres Vision 2018: Five Sharding Data Models
Postgres Vision 2018: Five Sharding Data Models
 
Data_Modeling_MongoDB.pdf
Data_Modeling_MongoDB.pdfData_Modeling_MongoDB.pdf
Data_Modeling_MongoDB.pdf
 
What Your Database Query is Really Doing
What Your Database Query is Really DoingWhat Your Database Query is Really Doing
What Your Database Query is Really Doing
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
 
High-Performance Advanced Analytics with Spark-Alchemy
High-Performance Advanced Analytics with Spark-AlchemyHigh-Performance Advanced Analytics with Spark-Alchemy
High-Performance Advanced Analytics with Spark-Alchemy
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 
Follow the money with graphs
Follow the money with graphsFollow the money with graphs
Follow the money with graphs
 
Building better SQL Server Databases
Building better SQL Server DatabasesBuilding better SQL Server Databases
Building better SQL Server Databases
 
pandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for Pythonpandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for Python
 
Dimensional Modelling Session 2
Dimensional Modelling Session 2Dimensional Modelling Session 2
Dimensional Modelling Session 2
 
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data CaptureData Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
 
MSBI and Data WareHouse techniques by Quontra
MSBI and Data WareHouse techniques by Quontra MSBI and Data WareHouse techniques by Quontra
MSBI and Data WareHouse techniques by Quontra
 
Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
 
History of NoSQL and Azure Documentdb feature set
History of NoSQL and Azure Documentdb feature setHistory of NoSQL and Azure Documentdb feature set
History of NoSQL and Azure Documentdb feature set
 
Real time streaming analytics
Real time streaming analyticsReal time streaming analytics
Real time streaming analytics
 
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
 
At the core you will have KUSTO
At the core you will have KUSTOAt the core you will have KUSTO
At the core you will have KUSTO
 

Mais de Citus Data

Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Citus Data
 

Mais de Citus Data (20)

Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
 
JSONB Tricks: Operators, Indexes, and When (Not) to Use It | PostgresOpen 201...
JSONB Tricks: Operators, Indexes, and When (Not) to Use It | PostgresOpen 201...JSONB Tricks: Operators, Indexes, and When (Not) to Use It | PostgresOpen 201...
JSONB Tricks: Operators, Indexes, and When (Not) to Use It | PostgresOpen 201...
 
Tutorial: Implementing your first Postgres extension | PGConf EU 2019 | Burak...
Tutorial: Implementing your first Postgres extension | PGConf EU 2019 | Burak...Tutorial: Implementing your first Postgres extension | PGConf EU 2019 | Burak...
Tutorial: Implementing your first Postgres extension | PGConf EU 2019 | Burak...
 
Whats wrong with postgres | PGConf EU 2019 | Craig Kerstiens
Whats wrong with postgres | PGConf EU 2019 | Craig KerstiensWhats wrong with postgres | PGConf EU 2019 | Craig Kerstiens
Whats wrong with postgres | PGConf EU 2019 | Craig Kerstiens
 
When it all goes wrong | PGConf EU 2019 | Will Leinweber
When it all goes wrong | PGConf EU 2019 | Will LeinweberWhen it all goes wrong | PGConf EU 2019 | Will Leinweber
When it all goes wrong | PGConf EU 2019 | Will Leinweber
 
Amazing SQL your ORM can (or can't) do | PGConf EU 2019 | Louise Grandjonc
Amazing SQL your ORM can (or can't) do | PGConf EU 2019 | Louise GrandjoncAmazing SQL your ORM can (or can't) do | PGConf EU 2019 | Louise Grandjonc
Amazing SQL your ORM can (or can't) do | PGConf EU 2019 | Louise Grandjonc
 
What Microsoft is doing with Postgres & the Citus Data acquisition | PGConf E...
What Microsoft is doing with Postgres & the Citus Data acquisition | PGConf E...What Microsoft is doing with Postgres & the Citus Data acquisition | PGConf E...
What Microsoft is doing with Postgres & the Citus Data acquisition | PGConf E...
 
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff DavisDeep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
 
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...
 
A story on Postgres index types | PostgresLondon 2019 | Louise Grandjonc
A story on Postgres index types | PostgresLondon 2019 | Louise GrandjoncA story on Postgres index types | PostgresLondon 2019 | Louise Grandjonc
A story on Postgres index types | PostgresLondon 2019 | Louise Grandjonc
 
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
 
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri FontaineThe Art of PostgreSQL | PostgreSQL Ukraine | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri Fontaine
 
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
 
When it all goes wrong (with Postgres) | RailsConf 2019 | Will Leinweber
When it all goes wrong (with Postgres) | RailsConf 2019 | Will LeinweberWhen it all goes wrong (with Postgres) | RailsConf 2019 | Will Leinweber
When it all goes wrong (with Postgres) | RailsConf 2019 | Will Leinweber
 
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri FontaineThe Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri Fontaine
 
How to write SQL queries | pgDay Paris 2019 | Dimitri Fontaine
How to write SQL queries | pgDay Paris 2019 | Dimitri FontaineHow to write SQL queries | pgDay Paris 2019 | Dimitri Fontaine
How to write SQL queries | pgDay Paris 2019 | Dimitri Fontaine
 
When it all Goes Wrong |Nordic PGDay 2019 | Will Leinweber
When it all Goes Wrong |Nordic PGDay 2019 | Will LeinweberWhen it all Goes Wrong |Nordic PGDay 2019 | Will Leinweber
When it all Goes Wrong |Nordic PGDay 2019 | Will Leinweber
 
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire Giordano
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire GiordanoWhy PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire Giordano
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire Giordano
 
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...
 
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas Fittl
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas FittlMonitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas Fittl
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas Fittl
 

Último

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Último (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 

Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig Kerstiens

  • 1. Five data models for sharding Craig Kerstiens, Head of Cloud at Citus
  • 2. Who am I? Craig Kerstiens http://www.craigkerstiens.com @craigkerstiens Postgres weekly Run Citus Cloud Co-chair PostgresOpen
  • 3. What is sharding Practice of separating a large database into smaller, faster, more easily managed parts called data shards. Source: the internet
  • 5. A table A schema A PG database A node An instance A PG cluster Is it a shard? NO YES
  • 6. 2 Nodes - 32 shards
  • 7. 4 nodes - still 32 shards
  • 8. Five models • Geography • Multi-tenant • Entity id • Graph model • Time series
  • 9. But first, two approaches
  • 11. Hash - The steps 1. Hash your id 2. Define a shard range x shards, and each contain some range of hash values. Route all inserts/updates/deletes to the shard 3. Profit
  • 12. More details • Hash based on some id • Postgres internal hash can work fine, or so can your own • Define your number of shards up front, make this larger than you expect to grow to in terms of nodes • (2 is bad) • (2 million is also bad) • Factors of 2 are nice, but not actually required
  • 13. Don’t just route values • 1-10 -> shard 1 • 2-20 -> shard 2
  • 14. Create range of hash values • hash 1 = 46154 • hash 2 = 27193 • Shard 13 = ranges 26624 to 28672
  • 15. Range - The steps 1. Ensure you’ve created your new destination for your range 2. Route your range to the right bucket 3. Profit
  • 17. Five models • Geography • Multi-tenant • Entity id • Graph model • Time series
  • 18. Click to edit master tile stylegeography
  • 19. Shard by Geography • Is there a clear line I can draw for a geographical boundary • Good examples: income by state, healthcare, etc. • Bad examples: • Text messages: 256 sends to 510, both want a copy of this data…
  • 20. Will geography sharding work for you? • Do you join across geographies? • Does data easily cross boundaries? • Is data queries across boundaries or a different access frequently?
  • 21. More specifics • Granular vs. broad • State vs. zip code • (California and texas are bad) • Zip codes might work, but does that work for your app?
  • 22. Common use cases • If your go to market is geography focused • Instacart/Shipt • Uber/Lyft
  • 23. Real world application • Range sharding makes moving things around harder here • Combining the geography and giving each and id, then hashing (but using smaller set of shards) can give better balance to your data
  • 25. Sharding by tenant • Is each customer’s data their own? • What’s your data’s distribution? • (If one tenant/customer is 50% of your data tenant sharding won’t help) • If it’s 10% of your data you may be okay
  • 26. Common use cases • Saas/B2B • Salesforce • Marketing automation • Any online SaaS
  • 27. Guidelines for multi-tenant sharding • Put your tenant_id on every table that’s relevant • Yes, denormalize • Ensure primary keys and foreign keys are composite ones (with tenant_id) • Enforce your tenant_id is on all queries so things are appropriately scoped
  • 28. Salesforce schema CREATE TABLE leads ( id serial primary key, first_name text, last_name text, email text ); CREATE TABLE accounts ( id serial primary key, name text, state varchar(2), size int ); CREATE TABLE opportunity ( id serial primary key, name text, amount int );
  • 29. Salesforce schema - with orgs CREATE TABLE leads ( id serial primary key, first_name text, last_name text, email text, org_id int ); CREATE TABLE accounts ( id serial primary key, name text, state varchar(2), size int org_id int ); CREATE TABLE opportunity ( id serial primary key, name text, amount int org_id int );
  • 30. Salesforce schema - with orgs CREATE TABLE leads ( id serial primary key, first_name text, last_name text, email text, org_id int ); CREATE TABLE accounts ( id serial primary key, name text, state varchar(2), size int org_id int ); CREATE TABLE opportunity ( id serial primary key, name text, amount int org_id int );
  • 31. Salesforce schema - with keys CREATE TABLE leads ( id serial, first_name text, last_name text, email text, org_id int, primary key (org_id, id) ); CREATE TABLE accounts ( id serial, name text, state varchar(2), size int, org_id int, primary key (org_id, id) ); CREATE TABLE opportunity ( id serial, name text, amount int,
  • 32. Salesforce schema - with keys CREATE TABLE leads ( id serial, first_name text, last_name text, email text, org_id int, primary key (org_id, id) ); CREATE TABLE accounts ( id serial, name text, state varchar(2), size int, org_id int, primary key (org_id, id) ); CREATE TABLE opportunity ( id serial, name text, amount int,
  • 33. Warnings about multi-tenant implementations • Danger ahead if using schemas on older PG versions • Have to reinvent the wheel for even the basics • Schema migrations • Connection limits • Think twice before using a schema or database per tenant
  • 34. Click to edit master tile styleentity_id
  • 35. Entity id • What’s an entity id? • Something granular • Want to join where you can though • Optimizing for parallelism and less for data in memory
  • 36. Examples tell it best • Web analytics • Shard by visitor_id • Shard both sessions and views • Key is to co-locate things you’ll join on
  • 37. Key considerations • SQL will be more limited OR slow • Think in terms of map reduce
  • 38. Map reduce examples • Count (*) • SUM of 32 smaller count (*) • Average • SUM of 32 smaller SUM(foo) / SUM of 32 smaller count(*) • Median • uh….
  • 39. But I like medians and more • Count distinct • HyperLogLog • Ordered list approximation • Top-n • Median • T-digest or HDR
  • 41. When you use a graph database • You’ll know, really you will
  • 43. But what about sharding? • Within a graph model you’re going to duplicate your data • Shard based on both: • The objects themselves • The objects subscribed to other objects
  • 46. Time series: It’s obvious right? • Well it depends
  • 47. Querying long ranges Not removing data Always querying time Querying a subset Remove old data Good Okay/Bad
  • 48. Time series • Range partitioning • 2016 in a bucket, 2017 in a bucket • 2016-01-01 in a bucket, 2016-01-02 in a bucket… • Key steps • Determine your ranges • Make sure you setup enough in advance, or automate creating new ones • Delete
  • 49. Sensor data CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int );
  • 50. Sensor data - initial partition CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int ) PARTITION BY RANGE (logdate);
  • 51. Sensor data - initial partition CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int ) PARTITION BY RANGE (logdate);
  • 52. Sensor data - setting up partitions CREATE TABLE measurement_y2017m10 PARTITION OF measurement FOR VALUES FROM ('2017-10-01') TO ('2017-10-31'); CREATE TABLE measurement_y2017m11 PARTITION OF measurement FOR VALUES FROM ('2017-11-01') TO ('2017-11-30');
  • 53. Sensor data - indexing CREATE TABLE measurement_y2017m10 PARTITION OF measurement FOR VALUES FROM ('2017-10-01') TO ('2017-10-31'); CREATE TABLE measurement_y2017m11 PARTITION OF measurement FOR VALUES FROM ('2017-11-01') TO ('2017-11-30'); CREATE INDEX ON measurement_y2017m10 (logdate); CREATE INDEX ON measurement_y2017m11 (logdate);
  • 54. Sensor data - inserting CREATE TRIGGER insert_measurement_trigger BEFORE INSERT ON measurement FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();
  • 55. Sensor data - inserting CREATE OR REPLACE FUNCTION measurement_insert_trigger() RETURNS TRIGGER AS $$ BEGIN IF ( NEW.logdate >= DATE '2017-02-01' AND NEW.logdate < DATE '2017-03-01' ) THEN INSERT INTO measurement_y2017m02 VALUES (NEW.*); ELSIF ( NEW.logdate >= DATE '2017-03-01' AND NEW.logdate < DATE '2017-04-01' ) THEN INSERT INTO measurement_y2017m03 VALUES (NEW.*); ... ELSIF ( NEW.logdate >= DATE '2018-01-01' AND NEW.logdate < DATE '2018-02-01' ) THEN INSERT INTO measurement_y2018m01 VALUES (NEW.*); ELSE RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!'; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql;
  • 56. Five models • Geography • Multi-tenant • Entity id • Graph model • Time series
  • 57. Recap • Not sharding is always easier than sharding • Identify your sharding approach/key early, denormalize it even when you’re small • Don’t force it into one model. No model is perfect, but disqualify where you can • Sharding used to be much more painful, it’s not quite a party yet, but it’s