Outline
1. Mourning the death of transactions
2. What is so hard about distributed systems?
3. Distributed consistency: managing asynchrony
4. Fault-tolerance: progress despite failures
The transaction concept

DEBIT_CREDIT:
  BEGIN_TRANSACTION;
  GET MESSAGE;
  EXTRACT ACCOUNT_NUMBER, DELTA, TELLER, BRANCH FROM MESSAGE;
  FIND ACCOUNT(ACCOUNT_NUMBER) IN DATA BASE;
  IF NOT_FOUND | ACCOUNT_BALANCE + DELTA < 0 THEN
    PUT NEGATIVE RESPONSE;
  ELSE DO;
    ACCOUNT_BALANCE = ACCOUNT_BALANCE + DELTA;
    POST HISTORY RECORD ON ACCOUNT (DELTA);
    CASH_DRAWER(TELLER) = CASH_DRAWER(TELLER) + DELTA;
    BRANCH_BALANCE(BRANCH) = BRANCH_BALANCE(BRANCH) + DELTA;
    PUT MESSAGE ('NEW BALANCE =' ACCOUNT_BALANCE);
  END;
  COMMIT;
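A minimal modern sketch of the same debit/credit logic, shown only to make the BEGIN/COMMIT/ROLLBACK contract explicit; the sqlite3 schema, table names, and column names are assumptions, not part of the original example.

# Illustrative sketch of DEBIT_CREDIT; the schema (accounts, tellers, branches,
# history) and column names are assumptions.
import sqlite3

def debit_credit(db, account, teller, branch, delta):
    try:
        with db:                                    # BEGIN ... COMMIT (or ROLLBACK)
            row = db.execute("SELECT balance FROM accounts WHERE id = ?",
                             (account,)).fetchone()
            if row is None or row[0] + delta < 0:   # NOT_FOUND | balance + delta < 0
                return "NEGATIVE RESPONSE"
            db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                       (delta, account))
            db.execute("INSERT INTO history (account, delta) VALUES (?, ?)",
                       (account, delta))
            db.execute("UPDATE tellers  SET cash    = cash    + ? WHERE id = ?",
                       (delta, teller))
            db.execute("UPDATE branches SET balance = balance + ? WHERE id = ?",
                       (delta, branch))
            return f"NEW BALANCE = {row[0] + delta}"
    except sqlite3.Error:
        return "NEGATIVE RESPONSE"                  # aborted: all or nothing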
Transactions: a holistic contract
[Figure: the application issues reads and writes against an opaque store; transactions supply the contract under which the application-level assertion balance > 0 holds.]
Incidental complexities
• The “Internet.” Searching it.
• Cross-datacenter replication schemes
• CAP Theorem
• Dynamo & MapReduce
• “Cloud”
Fundamental complexity
“[…] distributed systems require that the
programmer be aware of latency, have a different
model of memory access, and take into account
issues of concurrency and partial failure.”
Jim Waldo et al.,
A Note on Distributed Computing (1994)
A holistic contract
…stretched to the limit
[Figure: the same picture — application, reads and writes, opaque store, transactions.]
Are you blithely asserting
that transactions aren’t webscale?
Some people just want to see the world burn.
Those same people want to see the world use inconsistent databases.
- Emin Gun Sirer
Alternative to top-down design?
The “bottom-up,” systems tradition:
Simple, reusable components first.
Semantics later.
The “bottom-up” ethos
Simple, reusable components first.
Semantics later.
This is how we live now.
Question: Do we ever get those
application-level guarantees back?
When do contracts compose?
[Figure: the application now sits on a distributed service; the application-level assertion balance > 0 appears on both sides. (iow, did I get mongo in my riak?)]
Composition is the last hard problem
Composing modules is hard enough
We must learn how to compose guarantees
Outline
1. Mourning the death of transactions
2. What is so hard about distributed systems?
3. Distributed consistency: managing asynchrony
4. Fault-tolerance: progress despite failures
(asynchrony × partial failure) = hard²
Tackling one clown at a time
Poor strategy for programming distributed systems
Winning strategy for analyzing distributed programs
Outline
1. Mourning the death of transactions
2. What is so hard about distributed systems?
3. Distributed consistency: managing asynchrony
4. Fault-tolerance: progress despite failures
Graph queries as dataflow
[Figure: two dataflow pipelines over a graph store. Garbage collection: graph store, memory allocator, transitive closure, and garbage collector — the garbage collector is not confluent, the rest are. Deadlock detection: graph store, transaction manager, transitive closure, and deadlock detector — every component is confluent.]
Graph queries as dataflow
[Figure: the same two pipelines; the "Coordinate here" arrow marks the input to the non-confluent garbage collector.]
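A minimal sketch (illustrative names, not from the talk) of what the Confluent / Not Confluent labels mean: a confluent component's output set does not depend on the order or batching of its inputs, while the garbage-collector-style component does, because "unreachable" depends on edges it has not seen yet.

# Illustrative sketch: confluence = output set is insensitive to input order/batching.
from itertools import permutations

def transitive_closure(edges):
    """Confluent: more edges can only add reachable pairs, never retract them."""
    reach = set(edges)
    changed = True
    while changed:
        new = {(a, d) for (a, b) in reach for (c, d) in reach if b == c}
        changed = not new <= reach
        reach |= new
    return reach

def garbage(edges, roots, all_nodes):
    """Not confluent: 'unreachable' depends on edges we have NOT seen yet."""
    reach = transitive_closure(edges)
    live = set(roots) | {d for (s, d) in reach if s in roots}
    return set(all_nodes) - live

edges = [("root", "a"), ("a", "b")]
# Confluent: every arrival order of the edges yields the same closure.
assert len({frozenset(transitive_closure(p)) for p in permutations(edges)}) == 1
# Not confluent: deciding garbage after seeing only a prefix of the input gives a
# different (and unsafe) answer than deciding after seeing everything.
print(garbage(edges[:1], {"root"}, {"root", "a", "b"}))   # {'b'} -- premature!
print(garbage(edges,     {"root"}, {"root", "a", "b"}))   # set()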
Coordination: what is that?
Strategy 1: Establish a total order
[Figure: the garbage-collection dataflow again; coordinate here, ahead of the non-confluent garbage collector, by totally ordering its inputs.]
Coordination: what is that?
Strategy 2: Establish a producer-consumer barrier
[Figure: the same dataflow; coordinate here with a barrier between the producers and the non-confluent garbage collector.]
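A minimal sketch of strategy 2 (illustrative, not from the talk): a sealing barrier lets the non-confluent garbage collector run only after every producer has announced that its part of the stream is complete.

# Illustrative producer/consumer barrier ("sealing" the stream): the non-confluent
# step refuses to run until every producer says it is done.
class SealedStream:
    def __init__(self, producers):
        self.edges, self.pending = set(), set(producers)

    def publish(self, producer, edge):
        assert producer in self.pending, "stream already sealed for this producer"
        self.edges.add(edge)

    def seal(self, producer):
        self.pending.discard(producer)

    def collect_garbage(self, roots, all_nodes):
        if self.pending:                      # the barrier: don't run early
            raise RuntimeError("coordinate here: stream not yet sealed")
        live = set(roots)
        changed = True
        while changed:
            new = {d for (s, d) in self.edges if s in live}
            changed = not new <= live
            live |= new
        return set(all_nodes) - live

s = SealedStream(producers={"allocator-1", "allocator-2"})
s.publish("allocator-1", ("root", "a"))
s.seal("allocator-1")
# s.collect_garbage({"root"}, {"root", "a", "b"})  -> RuntimeError: not sealed yet
s.publish("allocator-2", ("a", "b"))
s.seal("allocator-2")
print(s.collect_garbage({"root"}, {"root", "a", "b"}))   # set(): nothing is garbage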
Fundamental costs: FT via replication
(mostly) free!
[Figure: the deadlock-detector dataflow (graph store, transaction manager, transitive closure, deadlock detector — all confluent), replicated; because every component is confluent, the replicas agree without coordination.]
Fundamental costs: FT via replication
global synchronization!
[Figure: the garbage-collector dataflow replicated; the non-confluent garbage collector forces global synchronization (Paxos) between the replicas.]
Fundamental costs: FT via replication
The first principle of successful scalability is to batter the
consistency mechanisms down to a minimum.
– James Hamilton
[Figure: the same replicated dataflows, with a barrier ahead of the non-confluent garbage collector in place of global synchronization.]
Language-level consistency
DSLs for distributed programming?
• Capture consistency concerns in the type system
[Figure: the consistency stack — Application / Language / Flow / Object / Storage.]
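One way to read "capture consistency concerns in the type system", sketched below (illustrative only; this is not Bloom's or any real DSL's API): expose only monotone operations, so programs written against the type stay confluent by construction.

# Illustrative: a type whose interface exposes only monotone operations.
from typing import Generic, TypeVar

T = TypeVar("T")

class GrowOnlySet(Generic[T]):
    def __init__(self) -> None:
        self._items: set[T] = set()

    def add(self, item: T) -> None:                      # monotone: never retracts
        self._items.add(item)

    def merge(self, other: "GrowOnlySet[T]") -> None:    # join in the lattice
        self._items |= other._items

    def contains(self, item: T) -> bool:                 # monotone threshold query:
        return item in self._items                       # once true, true forever

    # Deliberately no size(), remove(), or is_empty(): those answers can change
    # (or retract state) as more inputs arrive, breaking confluence.

a, b = GrowOnlySet[str](), GrowOnlySet[str]()
a.add("x"); b.add("y"); a.merge(b)
print(a.contains("x"), a.contains("y"))   # True True, in any merge order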
Let’s review
• Consistency is tolerance to asynchrony
• Tricks:
– focus on data in motion, not at rest
– avoid coordination when possible
– choose coordination carefully otherwise
(Tricks are great, but tools are better)
Outline
1. Mourning the death of transactions
2. What is so hard about distributed systems?
3. Distributed consistency: managing asynchrony
4. Fault-tolerance: progress despite failures
Grand challenge: composition
Hard problem:
Is a given component fault-tolerant?
Much harder:
Is this system (built up from components)
fault-tolerant?
Example: Kafka replication bug
Three “correct” components:
1. Primary/backup replication
2. Timeout-based failure detectors
3. Zookeeper
One nasty bug:
Acknowledged writes are lost
A guarantee would be nice
Bottom up approach:
• use formal methods to verify individual
components (e.g. protocols)
• Build systems from verified components
Shortcomings:
• Hard to use
• Hard to compose
[Figure: investment vs. returns]
Composing bottom-up
assurances
Issue 1: incompatible failure models
e.g., crash failures vs. omission failures
Issue 2: Specs do not compose
(FT is an end-to-end property)
If you take 10 components off the shelf, you are putting 10 world views
together, and the result will be a mess. -- Butler Lampson
End-to-end testing
would be nice
Top-down approach:
• Build a large-scale system
• Test the system under faults
Shortcomings:
• Hard to identify complex bugs
• Fundamentally incomplete
[Figure: investment vs. returns]
Lineage-driven fault injection
Goal: top-down testing that
• finds all of the fault-tolerance bugs, or
• certifies that none exist
Lineage-driven fault injection
(LDFI)
Approach: think backwards from outcomes
Question: could a bad thing ever happen?
Reframe:
• Why did a good thing happen?
• What could have gone wrong along the way?
Thomasina: What a faint-heart! We must
work outward from the middle of the
maze. We will start with something simple.
The game
• Both players agree on a failure model
• The programmer provides a protocol
• The adversary observes executions and
chooses failures for the next execution.
Dedalus: it’s about time
consequence@when :- premise[s]

State change:
  node(Node, Neighbor)@next :- node(Node, Neighbor);

Communication (via a natural join, bcast.Node1 == node.Node1):
  log(Node2, Pload)@async :- bcast(Node1, Pload),
                             node(Node1, Node2);
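A toy sketch (not the Dedalus runtime) of how those three rule shapes behave under asynchrony: persisted facts carry forward each tick, while @async conclusions arrive at a nondeterministically chosen later tick, or not at all. The tick counts and loss model are illustrative assumptions.

# Toy simulation of the three Dedalus rule shapes; not the real runtime.
import random

def simulate(ticks=5, loss_rate=0.0, seed=0):
    random.seed(seed)
    node = {("a", "b"), ("a", "c")}     # node(Node, Neighbor)@1, persisted below
    bcast = {("a", "payload")}          # bcast(Node, Pload)@1 (not persisted)
    log, in_flight = set(), []          # in_flight: (arrival_tick, fact)

    for t in range(1, ticks + 1):
        # Deliver @async facts whose nondeterministic arrival tick has come.
        log |= {fact for (arr, fact) in in_flight if arr <= t}
        # Communication: log(Node2, Pload)@async :- bcast(Node1, Pload),
        #                                           node(Node1, Node2);
        for (n1, p) in bcast:
            for (m1, n2) in node:
                if n1 == m1 and random.random() >= loss_rate:
                    in_flight.append((t + random.randint(1, 2), (n2, p)))
        # State change: node(N, M)@next :- node(N, M); -- modeled by keeping
        # `node` and `log` across iterations. bcast is not persisted, so the
        # naive broadcast fires exactly once.
        bcast = set()
    return log

print(simulate())                 # e.g. {('b', 'payload'), ('c', 'payload')}
print(simulate(loss_rate=1.0))    # set(): one lost message and the outcome is gone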
The match
Protocol:
Reliable broadcast
Specification:
Pre: A correct process delivers a message m
Post: All correct processes deliver m
Failure Model:
(Permanent) crash failures
Message loss / partitions
An execution is a (fragile) “proof” of an outcome
[Figure: the lineage graph of log(B, data) — node and log facts persist from timestep to timestep via rules r1 and r3, and log appears at B via the communication rule r2, which required a message from A to B at time 1 (AB1).]
Round 2: counterexample
[Space/time diagram: a's broadcast to b is lost and a crashes at time 2; c logs the payload but b never does.]
The adversary wins!
Round 3
Same as in Round 2, but symmetrical:
bcast(N, P)@next :- log(N, P);
Round 3 in space / time
[Space/time diagram: with the retry rule, every process that has logged the payload rebroadcasts it at every timestep (1–5), producing many redundant deliveries.]
Redundancy in space and time
Let’s reflect
Fault-tolerance is redundancy in space and
time.
Best strategy for both players: reason
backwards from outcomes using lineage
Finding bugs: find a set of failures that
“breaks” all derivations
Fixing bugs: add additional derivations
The role of the adversary can be automated
1. Break a proof by dropping any contributing message:
   (AB1 ∨ BC2) — a disjunction.
2. Find a set of failures that breaks all proofs of a good outcome:
   (AB1 ∨ BC2) ∧ (AC1) ∧ (AC2) — a conjunction of disjunctions (AKA CNF).
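A minimal sketch of the adversary's search (illustrative; the real LDFI implementation hands the equivalent CNF to a solver). Each "proof" below is just the set of messages its lineage depends on; breaking every proof means hitting every clause.

# Brute-force sketch of the adversary: find dropped messages that break all proofs.
from itertools import combinations

def break_all_proofs(proofs, messages, max_failures):
    """Smallest set of dropped messages that intersects every proof's support,
    or None if no such set exists within the failure budget."""
    for k in range(1, max_failures + 1):
        for dropped in combinations(messages, k):
            if all(set(dropped) & proof for proof in proofs):  # every clause hit
                return set(dropped)
    return None

# The slide's formula (AB1 v BC2) ^ (AC1) ^ (AC2), one clause per derivation:
proofs = [{"AB1", "BC2"}, {"AC1"}, {"AC2"}]
messages = ["AB1", "BC2", "AC1", "AC2"]
print(break_all_proofs(proofs, messages, max_failures=4))
# e.g. {'AB1', 'AC1', 'AC2'}: these drops falsify every known derivation, so the
# next execution injects exactly those failures and checks the outcome.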
Molly, the LDFI prototype
Molly finds fault-tolerance violations quickly
or guarantees that none exist.
Molly finds bugs by explaining good
outcomes – then it explains the bugs.
Bugs identified: 2pc, 2pc-ctp, 3pc, Kafka
Certified correct: paxos (synod), Flux, bully
leader election, reliable broadcast
Commit protocols
Problem:
Atomically change things
Correctness properties:
1. Agreement (All or nothing)
2. Termination (Something)
Two-phase commit
[Space/time diagram: the coordinator sends prepare ("Can I kick it?") to agents a, b, and d; each agent replies vote ("YES YOU CAN"); the coordinator sends commit ("Well I'm gone") to all.]
Two-phase commit
[Space/time diagram: the coordinator sends prepare and collects the agents' votes, then crashes before sending the decision; the agents block.]
Violation: Termination
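A minimal single-process sketch of the 2PC decision rule and the blocking failure mode from the diagram; the agent names and the crash flag are illustrative.

# Sketch of 2PC's decision logic; agents and the crash flag are illustrative.
def two_phase_commit(agent_votes, coordinator_crashes_before_decision=False):
    # Phase 1 ("Can I kick it?"): prepare is sent; agents answer with votes.
    decision = "commit" if all(v == "yes" for v in agent_votes.values()) else "abort"
    # Phase 2 ("Well I'm gone"): broadcast the decision -- unless the coordinator
    # crashes first, in which case every prepared agent blocks forever.
    if coordinator_crashes_before_decision:
        return {agent: "blocked (Termination violated)" for agent in agent_votes}
    return {agent: decision for agent in agent_votes}

print(two_phase_commit({"a": "yes", "b": "yes", "d": "yes"}))   # all commit
print(two_phase_commit({"a": "yes", "b": "no", "d": "yes"}))    # all abort
print(two_phase_commit({"a": "yes", "b": "yes", "d": "yes"},
                       coordinator_crashes_before_decision=True))
# every agent blocks: the "Violation: Termination" counterexample above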
The collaborative termination protocol
Basic idea:
Agents talk amongst themselves when the
coordinator fails.
Protocol: On timeout, ask other agents
about decision.
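A minimal sketch of that rule (illustrative): each timed-out agent polls its peers with decision_req and adopts any decision it hears; if no peer knows, it stays blocked, as in the diagram below.

# Sketch of the CTP timeout rule from the slide.
def ctp_on_timeout(peer_decisions):
    """peer_decisions: answers to decision_req; None for peers that don't know."""
    known = {d for d in peer_decisions if d is not None}
    return known.pop() if known else "still blocked"

print(ctp_on_timeout([None, None]))        # 'still blocked' -- nobody knows
print(ctp_on_timeout([None, "commit"]))    # 'commit'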
2PC - CTP
[Space/time diagram: prepare and vote as in 2PC; the coordinator crashes after collecting votes, and on timeout agents a, b, and d exchange decision_req messages. "Can I kick it?" "YES YOU CAN" "……?"]
3PC
Basic idea:
Add a round, a state, and simple failure detectors (timeouts).
Protocol:
1. Phase 1: Just like in 2PC
   – Agent timeout → abort
2. Phase 2: send preCommit, collect acks
   – Agent timeout → commit
3. Phase 3: Just like phase 2 of 2PC
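A minimal sketch of just those timeout rules (illustrative; phase names follow the slide). The partition scenario on the following slides is precisely these two rules disagreeing with the coordinator.

# Sketch of the 3PC timeout rules; nothing beyond what the slide states.
def agent_timeout_action(last_phase_heard: str) -> str:
    """What an agent does when it stops hearing from the coordinator."""
    if last_phase_heard == "canCommit":   # Phase 1: behave like 2PC -> abort
        return "abort"
    if last_phase_heard == "preCommit":   # Phase 2: everyone voted yes -> commit
        return "commit"
    return "wait"

# During a partition, agents that saw preCommit time out and commit, while the
# coordinator, believing an agent has died, aborts.
print(agent_timeout_action("preCommit"))   # commit
print(agent_timeout_action("canCommit"))   # abort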
3PC
[Space/time diagram: the coordinator sends cancommit to agents a, b, and d; they reply vote_msg; the coordinator sends precommit; they ack; the coordinator sends commit. Annotations: phase-1 timeout → abort; phase-2 timeout → commit.]
Network partitions make 3PC act crazy
[Space/time diagram: the coordinator sends cancommit; agent d crashes; agents a and b send vote_msg, receive precommit, and ack, so they learn the commit decision. Concluding that d is dead, the coordinator decides to abort, but a brief network partition loses its abort messages to a and b. Agents a and b then time out and decide to commit while the coordinator has aborted.]
Kafka durability bug
[Space/time diagram: replicas a, b, and c, Zookeeper, and a client. A brief network partition cuts a off from b and c; a becomes leader and sole in-sync replica; a ACKs the client's write; a then crashes. Data loss: the acknowledged write is gone.]
Molly summary
Lineage allows us to reason backwards
from good outcomes
Molly: surgically-targeted fault injection
Investment similar to testing
Returns similar to formal methods
Where we’ve been; where we’re headed
1. Mourning the death of transactions
2. What is so hard about distributed systems?
3. Distributed consistency: managing asynchrony
4. Fault-tolerance: progress despite failures
Where we’ve been; where we’re headed
1. We need application-level guarantees
2. What is so hard about distributed systems?
3. Distributed consistency: managing asynchrony
4. Fault-tolerance: progress despite failures
Where we’ve been; where we’re headed
1. We need application-level guarantees
2. (asynchrony X partial failure) = too hard to
hide! We need tools to manage it.
3. Distributed consistency: managing asynchrony
4. Fault-tolerance: progress despite failures
Where we’ve been; where we’re headed
1. We need application-level guarantees
2. asynchrony X partial failure = too hard to hide!
We need tools to manage it.
3. Focus on flow: data in motion
4. Fault-tolerance: progress despite failures
Outline
1. We need application-level guarantees
2. asynchrony X partial failure = too hard to hide!
We need tools to manage it.
3. Focus on flow: data in motion
4. Fault-tolerance: progress despite failures
Outline
1. We need application-level guarantees
2. asynchrony X partial failure = too hard to hide!
We need tools to manage it.
3. Focus on flow: data in motion
4. Backwards from outcomes
Remember
1. We need application-level guarantees
2. asynchrony X partial failure = too hard to hide! We
need tools to manage it.
3. Focus on flow: data in motion
4. Backwards from outcomes
Composition is the hardest problem
A happy crisis
Valentine: “It makes me so happy. To be at
the beginning again, knowing almost
nothing.... It's the best possible time of
being alive, when almost everything you
thought you knew is wrong.”
Editor's Notes
USER-CENTRIC
OMG pause here. Remember Brewer 2012? Top-down vs bottom-up designs? We had this top-down thing and it was beautiful.
It was so beautiful that it didn’t matter that it was somewhat ugly
The abstraction was so beautiful,
IT DOESN'T MATTER WHAT'S UNDERNEATH. Wait, or does it? When does it?
We’ve known for a long time that it is hard to hide the complexities of distribution
Focus not on semantics, but on the properties of components: thin interfaces, understandable latency & failure modes. DEV-centric
But can we ever recover those guarantees? I mean real guarantees, at the application level? Are my (app-level) constraints upheld? No? What can go wrong?
FIX ME: joe’s idea: sketch of a castle being filled in, vs bricks
In a world without transactions, one programmer must risk inconsistency to build a distributed application out of individually-verified components
Meaning: translation
DS are hard because of uncertainty – nondeterminism – which is fundamental to the environment and can “leak” into the results.
It’s astoundingly difficult to face these demons at the same time – tempting to try to defeat them one at a time.
Async isn’t a problem: just need to be careful to number messages and interleave correctly. Ignore arrival order.
Whoa, this is easy so far.
Failure isn’t a problem: just do redundant computation and store redundant data. Make more copies than there will be failures.
I win.
We can’t do deterministic interleaving if producers may fail.
Nondeterministic message order makes it hard to keep replicas in agreement.
To guard against failures, we replicate.
NB: asynchrony => replicas might not agree
Very similar looking criteria (1 safe 1 live). Takes some work, even on a single site. But hard in our scenario: disorder => replica disagreement, partial failure => missing partitions
FIX: make it about translation vs. prayer
I.e., reorderability, batchability, tolerance to duplication / retry
Now programmer must map from application invariants to object API (with richer semantics than read/write).
Convergence is a property of component state. It rules out divergence, but it does not readily compose.
However, not sufficient to synchronize GC.
Perhaps more importantly, not *compositional* -- what guarantees does my app – pieced together from many convergent objects – give?
To reason compositionally, need guarantees about what comes OUT of my objects, and how it transits the app.
*** main point to make here: we’d like to reason backwards from the outcomes, at the level of abstraction of the application.
We are interested in the properties of component *outputs* rather than just internal state. Hence we are interested in a different property: confluence.
A confluent module behaves like a function from sets (of inputs) to sets (of outputs)
Confluence is compositional: Composing confluent components yields a confluent dataflow
All of these components are confluent! Composing confluent components yields a confluent dataflow
But annotations are burdensome
A separate question is choosing a coordination strategy that “fits” the problem without “overpaying.” for example, we could establish a global ordering of messages, but that would essentially cost us what linearizable storage cost us. We can solve the GC problem with SEALING: establishing a big barrier; damming the stream.
M – a semantic property of code – implies confluence
An appropriately constrained language provides a conservative syntactic test for M.
Also note that a data-centric language gives us the dataflow graph automatically, via dependencies (across LOC, modules, processes, nodes, etc)
Try to not use it! Learn how to choose it. Tools help!
Start with a hard problem. Hard problem: does my FT protocol work?
Harder: is the composition of my components FT?
Point: we need to replicate data to both copies of a replica
We need to commit multiple partitions together
Examples! 2pc and replication. Properties, etc etc