SlideShare uma empresa Scribd logo
1 de 67
Sumedh Pathak, Co-Founder & VP Engineering, Citus Data
QCon London 2018
The Future of Distributed
Databases is Relational
@moss_toss | @citusdata
InfoQ.com: News & Community Site
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
future-distributed-database-relational
• Over 1,000,000 software developers, architects and CTOs read the site world-
wide every month
• 250,000 senior developers subscribe to our weekly newsletter
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• 2 dedicated podcast channels: The InfoQ Podcast, with a focus on
Architecture and The Engineering Culture Podcast, with a focus on building
• 96 deep dives on innovative topics packed as downloadable emags and
minibooks
• Over 40 new content items per week
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Presented at QCon London
www.qconlondon.com
About Me
• Co-Founder & VP
Engineering at Citus Data
• Amazon Shopping Cart
(former)
• Amazon Supply Chain &
Order Fulfillment (former)
• Stanford Computer Science
Citus Data co-founders, left-to-right
Ozgun Erdogan, Sumedh Pathak, & Umur Cubukcu
Photo cred: Willy Johnson, Monterey CA, Sep 2017
Sumedh Pathak | Citus Data | QCon London 2018
Why
RDBMS?
Sumedh Pathak | Citus Data | QCon London 2018
Streaming Storage
Map
Reduce
NoSQL
SQL
Database
SELECT ...
Application
Application
Sumedh Pathak | Citus Data | QCon London 2018
Because your architecture could be simpler
An RDBMS is a
general-purpose
data platform
Sumedh Pathak | Citus Data | QCon London 2018
Fast writes
Real-time & bulk
High throughput
High concurrency
Data consistency
Query optimizers
PostgreSQL
MySQL
MongoDB
SQL Server +
Oracle
Source: Hacker News, https://news.ycombinator.com
Startups Are Choosing Postgres% database job posts mentioning each database, across 20K+ job posts
Sumedh Pathak | Citus Data | QCon London 2018
but RDBMS’s
don’t scale, right?
Sumedh Pathak | Citus Data | QCon London 2018
but RDBMS’s don’t scale
RDBMS’s are hard to scale
Sumedh Pathak | Citus Data | QCon London 2018
What exactly needs to Scale?
- Tables (Data)
- Partitioning, Co-location, Reference Tables
- SQL (Reads)
- How do we express and optimize distributed SQL
- Transactions (Writes)
- Cross Shard updates/deletes, Global Atomic Transactions
Sumedh Pathak | Citus Data | QCon London 2018
1
2
3
Scaling
Tables
Sumedh Pathak | Citus Data | QCon London 2018
How to partition the data?
- Pick a column
- Date
- Id (customer_id, cart_id)
- Pick a method
- Hash
- Range
Partition data across nodes
R1
R2
R3
R4
R5
R6
R7
Coordinator
Node
Worker
Nodes
Shards
Sumedh Pathak | Citus Data | QCon London 2018
Worker → RDBMS, Shard → Table
Reference Tables
N1
N1
N1
N1
Copies of same table
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
Node
Worker
Nodes
Co-Location
R1
R2
R3
R4
S1
S2
S3
S4
Explicit Co-Location API.
E.g. Partition by Tenant
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
Node
Worker
Nodes
What about Foreign Keys?
The key to scaling tables...
- Use relational databases as a building block
- Understand semantics of application—to be smart about
partitioning
- Multi-tenant applications
Scaling
SQL
FROM table R
SELECT x Projectx
(R) → R’
WHERE f(x) Filterf(x)
(R) → R’
… JOIN … R × S → R’
SQL ↔ Relational Algebra
Sumedh Pathak | Citus Data | QCon London 2018
FROM sharded_table Collect(R1
,R2
,...) → R
Distributed Relational Algebra
R1
R2
C
Sumedh Pathak | Citus Data | QCon London 2018
Projectx
(Collect(R1
,R2
,...)) = Collect(Projectx
(R1
), Projectx
(R2
)...)
Commutative property
R1
R2
C
R1
R2
C
Px
Px
Px
Sumedh Pathak | Citus Data | QCon London 2018
Collect(R1
,R2
,...) x Collect(S1
,S2
,...) = Collect(R1
× S1
,R2
× S2,
...)
Distributive property
R1
R2
C
×
C
S1
S2
R1
R2
C
×
S1
S2
×
X = Join Operator
SUM(x)(Collect(R1
,R2
,...)) = SUM(Collect(SUM(R1
), SUM(R2
)...))
Associative property
R1
R2
C
R1
R2
C
Sumx
Sumx
Sumx
Sumedh Pathak | Citus Data | QCon London 2018
Sumx
SELECT sum(price) FROM orders, nation
WHERE orders.nation = nation.name AND
orders.date >= '2012-01-01' AND
nation.region = 'Asia';
Sumedh Pathak | Citus Data | QCon London 2018
Sumedh Pathak | Citus Data | QCon London 2018
Sumedh Pathak | Citus Data | QCon London 2018
Volcano style processing
Data flows
from
bottom
to top
Sumedh Pathak | Citus Data | QCon London 2018
Sumedh Pathak | Citus Data | QCon London 2018
Sumedh Pathak | Citus Data | QCon London 2018
Parallelize
Aggregate
Push Joins & Filters
below collect. Run in
parallel across all
nodes
Filters & Projections
done before Join
SELECT sum(intermediate_col)
FROM <concatenated results>;
SELECT sum(price)
FROM orders_2 JOIN nation_2
ON (orders_2.name = nation_2.name)
WHERE
orders_2.date >= '2017-01-01'
AND
nation_2.region = 'Asia';
SELECT sum(price)
FROM orders_2 JOIN nation_1
ON (orders_2.name = nation_1.name)
WHERE
orders_2.date >= '2017-01-01'
AND
nation_2.region = 'Asia';
Sumedh Pathak | Citus Data | QCon London 2018
Executing Distributed SQL
SQL database
orders_1
nation_1
orders
nation
SELECT sum(price)
FROM orders_2 o JOIN nation_1
ON (o.name = n.name)
WHERE o.date >= '2017-01-01'
AND n.region = 'Asia';
SELECT sum(price)
FROM orders_1 o JOIN nation_1
ON (o.name = n.name)
WHERE o.date >= '2017-01-01'
AND n.region = 'Asia';
SELECT sum(price) FROM <results>;
SQL database
orders_2
nation_1
The key to scaling SQL...
- New relational algebra operators for
distributed processing
- Relational Algebra Properties to optimize
tree: Commutativity, Associativity, &
Distributivity
- Map / Reduce operators
Scaling
Transactions
Money Transfer, as an example
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
Sumedh Pathak | Citus Data | QCon London 2018
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
Coordinator
Sumedh Pathak | Citus Data | QCon London 2018
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
COMMIT;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
Coordinator
BEGIN;
UPDATE accounts SET balance =
balance + 100 WHERE id = ‘BOB’;
Sumedh Pathak | Citus Data | QCon London 2018
BEGIN;
UPDATE accounts SET balance = balance
- 100 WHERE id = ‘ALICE’;
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
BEGIN;
UPDATE accounts SET balance =
balance + 100 WHERE id = ‘BOB’;
COMMIT
Sumedh Pathak | Citus Data | QCon London 2018
BEGIN;
UPDATE accounts SET balance = balance
- 100 WHERE id = ‘ALICE’;
Coordinator
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
PREPARE TRANSACTION ‘citus_...98’; PREPARE TRANSACTION ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
What happens during PREPARE?
State of transaction stored
on a durable store
Locks are
maintained
&
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
Coordinator
PREPARE TRANSACTION ‘citus_...98’; ROLLBACK TRANSACTION ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
PREPARE TRANSACTION ‘citus_...98’; PREPARE TRANSACTION ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
But....
Deadlocks!
Example
// SESSION 1
BEGIN;
UPDATE accounts SET balance
= balance - 100 WHERE id =
‘ALICE’;
(LOCK on ROW with id
‘ALICE’)
// SESSION 2
Sumedh Pathak | Citus Data | QCon London 2018
Example
// SESSION 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE
id = ‘ALICE’;
(LOCK on ROW with id ‘ALICE’)
// SESSION 2
BEGIN;
UPDATE accounts SET balance
= balance + 100 WHERE id =
‘BOB’;
(LOCK on ROW with id ‘BOB’)
Sumedh Pathak | Citus Data | QCon London 2018
Example
// SESSION 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE
id = ‘ALICE’;
(LOCK on ROW with id ‘ALICE’)
UPDATE accounts SET balance
= balance + 100 WHERE id =
‘BOB’;
// SESSION 2
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE
id = ‘BOB’;
(LOCK on ROW with id ‘BOB’)
Sumedh Pathak | Citus Data | QCon London 2018
Example
// SESSION 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE
id = ‘ALICE’;
(LOCK on ROW with id ‘ALICE’)
UPDATE accounts SET balance = balance + 100
WHERE id = ‘BOB’;
// SESSION 2
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE
id = ‘BOB’;
(LOCK on ROW with id ‘BOB’)
UPDATE accounts SET balance
= balance - 100 WHERE id =
‘ALICE’
Sumedh Pathak | Citus Data | QCon London 2018
How do Relational DB’s solve this?
S1
S2
- Construct a
Directed Graph
- Each node is a
session/transaction
- Edge represents a
wait on a lock
Waiting
on ‘Bob’
Waiting on
‘Alice’
S1
S2
S1 S2
S2 S1
S2 Waits on S1 S1 Waits on S2
Sumedh Pathak | Citus Data | QCon London 2018
S1
S2
S1 S2
S2 S1
S2 Waits on S1 S1 Waits on S2
Distributed
Deadlock
Detector
S1
S2 S2
S1
Sumedh Pathak | Citus Data | QCon London 2018
keys to scaling transactions
- 2PC to ensure atomic transactions across nodes
- Deadlock Detection—to scale complex & concurrent
transaction workloads
- MVCC
- Failure Handling
4
It’s 2018. Distributed can be Relational
- Scale tables—via sharding
- Scale SQL—via distributed relational algebra
- Scale transactions—via 2PC & Deadlock Detection
Sumedh Pathak | Citus Data | QCon London 2018
Now, how do we implement all of this?
Postgres
Sumedh Pathak | Citus Data | QCon London 2018
PostgreSQL
Planner
Executor
Custom scan
Commit / abort
Extension (.so)
Access methods
Foreign tables
Functions
...
...
...
...
...
...
...
CREATE EXTENSION ...
PostgreSQL PostgreSQL
PostgreSQL
shardsshardsshard
shardsshardsshard
SELECT … FROM distributed_table …
SELECT … FROM shard… SELECT … FROM shard…
Citus
PostgreSQL
Citus
PostGIS PL/Python
JSONB
2PC
Replication...
Sequences
Indexes
Full-Text SearchTransactions
…
dblink
Sumedh Pathak | Citus Data | QCon London 2018
Rich
ecosystem of
tooling, data
types,
& extensions
in
PostgreSQL
PostgreSQL can be
extended into an all-purpose
distributed RDBMS
I believe the future of distributed
databases is relational.
Thank you!
Sumedh Pathak
sumedh@citusdata.com
@moss_toss | @citusdata | citusdata.com
QCon London 2018 | The Future of Distributed Databases is Relational
Watch the video with slide synchronization on
InfoQ.com!
https://www.infoq.com/presentations/future-
distributed-database-relational

Mais conteúdo relacionado

Mais de C4Media

Mais de C4Media (20)

Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in Adtech
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/await
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven Utopia
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 
A Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinA Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with Brooklin
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

The Future of Distributed Databases Is Relational

  • 1. Sumedh Pathak, Co-Founder & VP Engineering, Citus Data QCon London 2018 The Future of Distributed Databases is Relational @moss_toss | @citusdata
  • 2. InfoQ.com: News & Community Site Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ future-distributed-database-relational • Over 1,000,000 software developers, architects and CTOs read the site world- wide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon London www.qconlondon.com
  • 4. About Me • Co-Founder & VP Engineering at Citus Data • Amazon Shopping Cart (former) • Amazon Supply Chain & Order Fulfillment (former) • Stanford Computer Science Citus Data co-founders, left-to-right Ozgun Erdogan, Sumedh Pathak, & Umur Cubukcu Photo cred: Willy Johnson, Monterey CA, Sep 2017 Sumedh Pathak | Citus Data | QCon London 2018
  • 5.
  • 6. Why RDBMS? Sumedh Pathak | Citus Data | QCon London 2018
  • 7. Streaming Storage Map Reduce NoSQL SQL Database SELECT ... Application Application Sumedh Pathak | Citus Data | QCon London 2018 Because your architecture could be simpler
  • 8. An RDBMS is a general-purpose data platform Sumedh Pathak | Citus Data | QCon London 2018 Fast writes Real-time & bulk High throughput High concurrency Data consistency Query optimizers
  • 9. PostgreSQL MySQL MongoDB SQL Server + Oracle Source: Hacker News, https://news.ycombinator.com Startups Are Choosing Postgres% database job posts mentioning each database, across 20K+ job posts Sumedh Pathak | Citus Data | QCon London 2018
  • 10. but RDBMS’s don’t scale, right? Sumedh Pathak | Citus Data | QCon London 2018
  • 11. but RDBMS’s don’t scale RDBMS’s are hard to scale Sumedh Pathak | Citus Data | QCon London 2018
  • 12. What exactly needs to Scale? - Tables (Data) - Partitioning, Co-location, Reference Tables - SQL (Reads) - How do we express and optimize distributed SQL - Transactions (Writes) - Cross Shard updates/deletes, Global Atomic Transactions Sumedh Pathak | Citus Data | QCon London 2018 1 2 3
  • 13. Scaling Tables Sumedh Pathak | Citus Data | QCon London 2018
  • 14. How to partition the data? - Pick a column - Date - Id (customer_id, cart_id) - Pick a method - Hash - Range
  • 15. Partition data across nodes R1 R2 R3 R4 R5 R6 R7 Coordinator Node Worker Nodes Shards Sumedh Pathak | Citus Data | QCon London 2018
  • 16. Worker → RDBMS, Shard → Table
  • 17. Reference Tables N1 N1 N1 N1 Copies of same table Sumedh Pathak | Citus Data | QCon London 2018 Coordinator Node Worker Nodes
  • 18. Co-Location R1 R2 R3 R4 S1 S2 S3 S4 Explicit Co-Location API. E.g. Partition by Tenant Sumedh Pathak | Citus Data | QCon London 2018 Coordinator Node Worker Nodes
  • 20. The key to scaling tables... - Use relational databases as a building block - Understand semantics of application—to be smart about partitioning - Multi-tenant applications
  • 22. FROM table R SELECT x Projectx (R) → R’ WHERE f(x) Filterf(x) (R) → R’ … JOIN … R × S → R’ SQL ↔ Relational Algebra Sumedh Pathak | Citus Data | QCon London 2018
  • 23. FROM sharded_table Collect(R1 ,R2 ,...) → R Distributed Relational Algebra R1 R2 C Sumedh Pathak | Citus Data | QCon London 2018
  • 24. Projectx (Collect(R1 ,R2 ,...)) = Collect(Projectx (R1 ), Projectx (R2 )...) Commutative property R1 R2 C R1 R2 C Px Px Px Sumedh Pathak | Citus Data | QCon London 2018
  • 25. Collect(R1 ,R2 ,...) x Collect(S1 ,S2 ,...) = Collect(R1 × S1 ,R2 × S2, ...) Distributive property R1 R2 C × C S1 S2 R1 R2 C × S1 S2 × X = Join Operator
  • 26. SUM(x)(Collect(R1 ,R2 ,...)) = SUM(Collect(SUM(R1 ), SUM(R2 )...)) Associative property R1 R2 C R1 R2 C Sumx Sumx Sumx Sumedh Pathak | Citus Data | QCon London 2018 Sumx
  • 27. SELECT sum(price) FROM orders, nation WHERE orders.nation = nation.name AND orders.date >= '2012-01-01' AND nation.region = 'Asia'; Sumedh Pathak | Citus Data | QCon London 2018
  • 28. Sumedh Pathak | Citus Data | QCon London 2018
  • 29. Sumedh Pathak | Citus Data | QCon London 2018 Volcano style processing Data flows from bottom to top
  • 30. Sumedh Pathak | Citus Data | QCon London 2018
  • 31. Sumedh Pathak | Citus Data | QCon London 2018
  • 32. Sumedh Pathak | Citus Data | QCon London 2018 Parallelize Aggregate Push Joins & Filters below collect. Run in parallel across all nodes Filters & Projections done before Join
  • 33. SELECT sum(intermediate_col) FROM <concatenated results>; SELECT sum(price) FROM orders_2 JOIN nation_2 ON (orders_2.name = nation_2.name) WHERE orders_2.date >= '2017-01-01' AND nation_2.region = 'Asia'; SELECT sum(price) FROM orders_2 JOIN nation_1 ON (orders_2.name = nation_1.name) WHERE orders_2.date >= '2017-01-01' AND nation_2.region = 'Asia'; Sumedh Pathak | Citus Data | QCon London 2018
  • 34. Executing Distributed SQL SQL database orders_1 nation_1 orders nation SELECT sum(price) FROM orders_2 o JOIN nation_1 ON (o.name = n.name) WHERE o.date >= '2017-01-01' AND n.region = 'Asia'; SELECT sum(price) FROM orders_1 o JOIN nation_1 ON (o.name = n.name) WHERE o.date >= '2017-01-01' AND n.region = 'Asia'; SELECT sum(price) FROM <results>; SQL database orders_2 nation_1
  • 35. The key to scaling SQL... - New relational algebra operators for distributed processing - Relational Algebra Properties to optimize tree: Commutativity, Associativity, & Distributivity - Map / Reduce operators
  • 37. Money Transfer, as an example BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; Sumedh Pathak | Citus Data | QCon London 2018
  • 38. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 Coordinator Sumedh Pathak | Citus Data | QCon London 2018
  • 39. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; Sumedh Pathak | Citus Data | QCon London 2018 Coordinator
  • 40. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; COMMIT; Sumedh Pathak | Citus Data | QCon London 2018 Coordinator
  • 41. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 Coordinator BEGIN; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; Sumedh Pathak | Citus Data | QCon London 2018 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’;
  • 42. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 BEGIN; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT Sumedh Pathak | Citus Data | QCon London 2018 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; Coordinator
  • 43. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 PREPARE TRANSACTION ‘citus_...98’; PREPARE TRANSACTION ‘citus_...98’; Sumedh Pathak | Citus Data | QCon London 2018 Coordinator
  • 44. What happens during PREPARE? State of transaction stored on a durable store Locks are maintained &
  • 45. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 Coordinator PREPARE TRANSACTION ‘citus_...98’; ROLLBACK TRANSACTION ‘citus_...98’; Sumedh Pathak | Citus Data | QCon London 2018
  • 46. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 PREPARE TRANSACTION ‘citus_...98’; PREPARE TRANSACTION ‘citus_...98’; Sumedh Pathak | Citus Data | QCon London 2018 Coordinator
  • 47. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’; Sumedh Pathak | Citus Data | QCon London 2018 Coordinator
  • 48. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’; Sumedh Pathak | Citus Data | QCon London 2018 Coordinator
  • 49. A1 A2 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; COMMIT; A3 A4 COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’; Sumedh Pathak | Citus Data | QCon London 2018 Coordinator
  • 51. Example // SESSION 1 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; (LOCK on ROW with id ‘ALICE’) // SESSION 2 Sumedh Pathak | Citus Data | QCon London 2018
  • 52. Example // SESSION 1 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; (LOCK on ROW with id ‘ALICE’) // SESSION 2 BEGIN; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; (LOCK on ROW with id ‘BOB’) Sumedh Pathak | Citus Data | QCon London 2018
  • 53. Example // SESSION 1 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; (LOCK on ROW with id ‘ALICE’) UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; // SESSION 2 BEGIN; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; (LOCK on ROW with id ‘BOB’) Sumedh Pathak | Citus Data | QCon London 2018
  • 54. Example // SESSION 1 BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’; (LOCK on ROW with id ‘ALICE’) UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; // SESSION 2 BEGIN; UPDATE accounts SET balance = balance + 100 WHERE id = ‘BOB’; (LOCK on ROW with id ‘BOB’) UPDATE accounts SET balance = balance - 100 WHERE id = ‘ALICE’ Sumedh Pathak | Citus Data | QCon London 2018
  • 55. How do Relational DB’s solve this? S1 S2 - Construct a Directed Graph - Each node is a session/transaction - Edge represents a wait on a lock Waiting on ‘Bob’ Waiting on ‘Alice’
  • 56. S1 S2 S1 S2 S2 S1 S2 Waits on S1 S1 Waits on S2 Sumedh Pathak | Citus Data | QCon London 2018
  • 57. S1 S2 S1 S2 S2 S1 S2 Waits on S1 S1 Waits on S2 Distributed Deadlock Detector S1 S2 S2 S1 Sumedh Pathak | Citus Data | QCon London 2018
  • 58. keys to scaling transactions - 2PC to ensure atomic transactions across nodes - Deadlock Detection—to scale complex & concurrent transaction workloads - MVCC - Failure Handling 4
  • 59. It’s 2018. Distributed can be Relational - Scale tables—via sharding - Scale SQL—via distributed relational algebra - Scale transactions—via 2PC & Deadlock Detection Sumedh Pathak | Citus Data | QCon London 2018 Now, how do we implement all of this?
  • 60. Postgres Sumedh Pathak | Citus Data | QCon London 2018
  • 61. PostgreSQL Planner Executor Custom scan Commit / abort Extension (.so) Access methods Foreign tables Functions ... ... ... ... ... ... ... CREATE EXTENSION ...
  • 62. PostgreSQL PostgreSQL PostgreSQL shardsshardsshard shardsshardsshard SELECT … FROM distributed_table … SELECT … FROM shard… SELECT … FROM shard… Citus
  • 63. PostgreSQL Citus PostGIS PL/Python JSONB 2PC Replication... Sequences Indexes Full-Text SearchTransactions … dblink Sumedh Pathak | Citus Data | QCon London 2018 Rich ecosystem of tooling, data types, & extensions in PostgreSQL
  • 64. PostgreSQL can be extended into an all-purpose distributed RDBMS
  • 65. I believe the future of distributed databases is relational.
  • 66. Thank you! Sumedh Pathak sumedh@citusdata.com @moss_toss | @citusdata | citusdata.com QCon London 2018 | The Future of Distributed Databases is Relational
  • 67. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/future- distributed-database-relational