Years ago when working at Amazon on shopping cart infrastructure and the precursor to DynamoDB, my co-founder and I realized that while distributed key value stores were useful for a few use-cases, we missed many of the benefits of relational databases: transactions, joins, and the power of the lingua franca of RDBMS’s: SQL. So we challenged ourselves to modernize the traditional relational database, to take a robust open source relational database and transform it into a distributed database.
This talk is about my team’s journey to create a more modern relational database. I’ll talk about the distributed systems problems we had to solve in order to scale out the Postgres open source database, in order to achieve parallelism and a concomitant increase in performance. I'll describe the architecture of the distributed query planner; how we extend traditional relational algebra operators to plan distributed queries and scale reads. I’ll also describe distributed deadlock detection, and how that enabled us to scale out transactions spanning multiple machines.
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
The Future of Distributed Databases is Relational
1. Sumedh Pathak, Co-Founder & VP Engineering, Citus Data
QCon London 2018
The Future of Distributed
Databases is Relational
@moss_toss | @citusdata
2. About Me
• Co-Founder & VP
Engineering at Citus Data
• Amazon Shopping Cart
(former)
• Amazon Supply Chain &
Order Fulfillment (former)
• Stanford Computer Science
Citus Data co-founders, left-to-right
Ozgun Erdogan, Sumedh Pathak, & Umur Cubukcu
Photo cred: Willy Johnson, Monterey CA, Sep 2017
Sumedh Pathak | Citus Data | QCon London 2018
6. An RDBMS is a
general-purpose
data platform
Sumedh Pathak | Citus Data | QCon London 2018
Fast writes
Real-time & bulk
High throughput
High concurrency
Data consistency
Query optimizers
7. PostgreSQL
MySQL
MongoDB
SQL Server +
Oracle
Source: Hacker News, https://news.ycombinator.com
Startups Are Choosing Postgres% database job posts mentioning each database, across 20K+ job posts
Sumedh Pathak | Citus Data | QCon London 2018
9. but RDBMS’s don’t scale
RDBMS’s are hard to scale
Sumedh Pathak | Citus Data | QCon London 2018
10. What exactly needs to Scale?
- Tables (Data)
- Partitioning, Co-location, Reference Tables
- SQL (Reads)
- How do we express and optimize distributed SQL
- Transactions (Writes)
- Cross Shard updates/deletes, Global Atomic Transactions
Sumedh Pathak | Citus Data | QCon London 2018
1
2
3
18. The key to scaling tables...
- Use relational databases as a building block
- Understand semantics of application—to be smart about
partitioning
- Multi-tenant applications
20. FROM table R
SELECT x Projectx
(R) → R’
WHERE f(x) Filterf(x)
(R) → R’
… JOIN … R × S → R’
SQL ↔ Relational Algebra
Sumedh Pathak | Citus Data | QCon London 2018
25. SELECT sum(price) FROM orders, nation
WHERE orders.nation = nation.name AND
orders.date >= '2012-01-01' AND
nation.region = 'Asia';
Sumedh Pathak | Citus Data | QCon London 2018
30. Sumedh Pathak | Citus Data | QCon London 2018
Parallelize
Aggregate
Push Joins & Filters
below collect. Run in
parallel across all
nodes
Filters & Projections
done before Join
31. SELECT sum(intermediate_col)
FROM <concatenated results>;
SELECT sum(price)
FROM orders_2 JOIN nation_2
ON (orders_2.name = nation_2.name)
WHERE
orders_2.date >= '2017-01-01'
AND
nation_2.region = 'Asia';
SELECT sum(price)
FROM orders_2 JOIN nation_1
ON (orders_2.name = nation_1.name)
WHERE
orders_2.date >= '2017-01-01'
AND
nation_2.region = 'Asia';
Sumedh Pathak | Citus Data | QCon London 2018
32. Executing Distributed SQL
SQL database
orders_1
nation_1
orders
nation
SELECT sum(price)
FROM orders_2 o JOIN nation_1
ON (o.name = n.name)
WHERE o.date >= '2017-01-01'
AND n.region = 'Asia';
SELECT sum(price)
FROM orders_1 o JOIN nation_1
ON (o.name = n.name)
WHERE o.date >= '2017-01-01'
AND n.region = 'Asia';
SELECT sum(price) FROM <results>;
SQL database
orders_2
nation_1
33. The key to scaling SQL...
- New relational algebra operators for
distributed processing
- Relational Algebra Properties to optimize
tree: Commutativity, Associativity, &
Distributivity
- Map / Reduce operators
35. Money Transfer, as an example
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
Sumedh Pathak | Citus Data | QCon London 2018
36. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
Coordinator
Sumedh Pathak | Citus Data | QCon London 2018
37. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
38. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
COMMIT;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
39. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
Coordinator
BEGIN;
UPDATE accounts SET balance =
balance + 100 WHERE id = ‘BOB’;
Sumedh Pathak | Citus Data | QCon London 2018
BEGIN;
UPDATE accounts SET balance = balance
- 100 WHERE id = ‘ALICE’;
40. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
BEGIN;
UPDATE accounts SET balance =
balance + 100 WHERE id = ‘BOB’;
COMMIT
Sumedh Pathak | Citus Data | QCon London 2018
BEGIN;
UPDATE accounts SET balance = balance
- 100 WHERE id = ‘ALICE’;
Coordinator
41. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
PREPARE TRANSACTION ‘citus_...98’; PREPARE TRANSACTION ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
42. What happens during PREPARE?
State of transaction stored
on a durable store
Locks are
maintained
&
43. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
Coordinator
PREPARE TRANSACTION ‘citus_...98’; ROLLBACK TRANSACTION ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
44. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
PREPARE TRANSACTION ‘citus_...98’; PREPARE TRANSACTION ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
45. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
46. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
47. A1
A2
BEGIN;
UPDATE accounts SET balance = balance -
100 WHERE id = ‘ALICE’;
UPDATE accounts SET balance = balance +
100 WHERE id = ‘BOB’;
COMMIT;
A3
A4
COMMIT PREPARED ‘citus_...98’; COMMIT PREPARED ‘citus_...98’;
Sumedh Pathak | Citus Data | QCon London 2018
Coordinator
49. Example
// SESSION 1
BEGIN;
UPDATE accounts SET balance
= balance - 100 WHERE id =
‘ALICE’;
(LOCK on ROW with id
‘ALICE’)
// SESSION 2
Sumedh Pathak | Citus Data | QCon London 2018
50. Example
// SESSION 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE
id = ‘ALICE’;
(LOCK on ROW with id ‘ALICE’)
// SESSION 2
BEGIN;
UPDATE accounts SET balance
= balance + 100 WHERE id =
‘BOB’;
(LOCK on ROW with id ‘BOB’)
Sumedh Pathak | Citus Data | QCon London 2018
51. Example
// SESSION 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE
id = ‘ALICE’;
(LOCK on ROW with id ‘ALICE’)
UPDATE accounts SET balance
= balance + 100 WHERE id =
‘BOB’;
// SESSION 2
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE
id = ‘BOB’;
(LOCK on ROW with id ‘BOB’)
Sumedh Pathak | Citus Data | QCon London 2018
52. Example
// SESSION 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE
id = ‘ALICE’;
(LOCK on ROW with id ‘ALICE’)
UPDATE accounts SET balance = balance + 100
WHERE id = ‘BOB’;
// SESSION 2
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE
id = ‘BOB’;
(LOCK on ROW with id ‘BOB’)
UPDATE accounts SET balance
= balance - 100 WHERE id =
‘ALICE’
Sumedh Pathak | Citus Data | QCon London 2018
53. How do Relational DB’s solve this?
S1
S2
- Construct a
Directed Graph
- Each node is a
session/transaction
- Edge represents a
wait on a lock
Waiting
on ‘Bob’
Waiting on
‘Alice’
54. S1
S2
S1 S2
S2 S1
S2 Waits on S1 S1 Waits on S2
Sumedh Pathak | Citus Data | QCon London 2018
55. S1
S2
S1 S2
S2 S1
S2 Waits on S1 S1 Waits on S2
Distributed
Deadlock
Detector
S1
S2 S2
S1
Sumedh Pathak | Citus Data | QCon London 2018
56. keys to scaling transactions
- 2PC to ensure atomic transactions across nodes
- Deadlock Detection—to scale complex & concurrent
transaction workloads
- MVCC
- Failure Handling
4
57. It’s 2018. Distributed can be Relational
- Scale tables—via sharding
- Scale SQL—via distributed relational algebra
- Scale transactions—via 2PC & Deadlock Detection
Sumedh Pathak | Citus Data | QCon London 2018
Now, how do we implement all of this?