Although event sourcing (and its sister pattern, CQRS) has been gaining traction in recent years, it still baffles many engineers implementing it for the first time. There is plenty of material on the subject, but most of it is too basic or too theoretical for practical use, and engineers often end up reinventing (or rediscovering) suitable approaches and techniques.
This talk focuses on the practical aspects of building event-sourced systems, drawing on lessons learned from building such systems at Wix. We'll walk through the design and implementation of a relatively simple event-sourced system, covering the event model, the underlying persistence model, code layering/factoring, and operational considerations.
A talk given at Reversim Summit 2017 in Tel-Aviv, Israel.
1. An Abridged Guide to Event Sourcing
Tomer Gabel
Reversim Summit
Tel-Aviv 2017
Image: Jack Zalium, “Abriged” via Flickr (CC-BY-ND 2.0)
2. Background
• ADI is a site builder
• It’s pretty nifty
– Huge web application
– … and it even works!
• A cool app isn’t enough
• Sites have to be stored somewhere!
3. Requirements
• Features:
– Store “site” blobs
– Store version history
– Soft-delete only
• Not required:
– Concurrent editing
Image: ImgFlip
4. 1. WHY CRUD SUCKS
crud /krəd/ (noun, informal)
1. A substance which is considered unpleasant or disgusting, typically because of its dirtiness.
2. Nonsense.
Source: Oxford Dictionary
5. Mutation Anxiety
• So what’s wrong with CRUD?
• Create
• Read
• Update
• Delete
• It’s all about mutable state
6. Mutation Anxiety
Mutable state is bad.
• Old data is lost
• Hard to debug
• Can’t fix retroactively
• No built-in auditing
Image: David Bleasdale, “Fahrenheit 451” via Flickr (CC-BY 2.0)
7. Mutation Anxiety
“Who told you
you’re allowed to
destroy data?”
-- Greg Young
Image: Code on the Beach speaker profile (source)
8. Not To Scale
• CRUD implies:
– One source of truth
– Reads, writes against the same store
– Full consistency (ACID)
• Difficult to scale!
Image: Jason Baker, “Bank Vaults under Hotels in Toronto, Ontari” via Flickr (CC-BY 2.0)
9. What’s Holding Us Back?
• Strong consistency
– Assumed with RDBMS
– Often not required!
• Consistency is a product concern
– “Does this have to be 100% fresh?”
– “No? So how stale can this get?”
Images: “Homemade Bread Freshly Baked” via MaxPixel (CC0, above); Tasha, “Homemade Croutons” via Flickr (CC-BY 2.0, below)
10. What’s Holding Us Back?
• Storage cost
– “How long must I hold on to data?”
– “What do you mean forever?!”
• Cost is an operational concern
– “So how much data is there?”
– “How much will it cost to retain?”
Image: Calvin Fraites via Flickr (CC-BY-NC-ND 2.0)
11. 2. A BETTER WAY
“A database is just a view over its
transaction log.”
-- Ancient Vulcan proverb
12. Event Sourcing
• A very simple pattern
• Each entity has its own event stream
– Events are facts
– Events are immutable
– Events are forever
Example event stream: Opened Account → Deposited $100 → Withdrawn $25 → Deposited $12
Balance: $87
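The account example above can be sketched as a fold over the event stream. This is a minimal illustration, not code from the talk: the `Event` class, the `apply` function and the event kind names are all hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)  # frozen: events are immutable facts
class Event:
    kind: str        # "opened", "deposited" or "withdrawn" (hypothetical names)
    amount: int = 0  # whole dollars; unused for "opened"


def apply(balance: int, event: Event) -> int:
    """Fold a single event into the running balance."""
    if event.kind == "opened":
        return 0
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


# The stream from the slide, oldest event first.
stream = (
    Event("opened"),
    Event("deposited", 100),
    Event("withdrawn", 25),
    Event("deposited", 12),
)

# The balance is never stored; it is derived by replaying the stream.
balance = reduce(apply, stream, 0)
print(balance)  # 87
```

Replaying a prefix of the stream gives the balance at that point in time, which is what makes the history auditable.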
13. CQRS
• Writes simply append events
• But reads are projections:
– Full snapshots
– Partial/filter queries
– Views and joins
• Two separate concerns!
– Different schemas
– Different instances
– Even different data stores!
Event stream (over time): Site 1 Created → Site 1 Updated → Site 1 Updated → Site 2 Created → Site 1 Deleted
Projection: “active sites”
Site ID | Active
1 | false
2 | true
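The “active sites” projection above could be derived from the event stream roughly as follows. This is a sketch under assumptions: events are reduced to hypothetical `(site_id, event_type)` pairs, and `project_active_sites` is an illustrative name, not part of the talk's sample implementation.

```python
from typing import Dict, Iterable, Tuple

# (site_id, event_type) pairs, in the order they were appended.
Event = Tuple[int, str]


def project_active_sites(events: Iterable[Event]) -> Dict[int, bool]:
    """Build a read-side view: site ID -> whether the site is active."""
    active: Dict[int, bool] = {}
    for site_id, event_type in events:
        if event_type == "Created":
            active[site_id] = True
        elif event_type == "Deleted":
            active[site_id] = False
        # "Updated" does not change whether a site is active.
    return active


log = [
    (1, "Created"),
    (1, "Updated"),
    (1, "Updated"),
    (2, "Created"),
    (1, "Deleted"),
]

active_sites = project_active_sites(log)
print(active_sites)  # {1: False, 2: True}
```

Because the view is derived entirely from events, it can live in a different schema or data store than the write side and be rebuilt from scratch at any time.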
14. Better how?
• Tunable consistency
– Full or eventual
– Per use-case!
• Decoupled reads/writes
– Better scale
– Much more flexible!
• Built-in auditing
– Easier to debug
– Replayable!
16. An Event Model
• Our business domain:
a website
• Our use cases:
– “Let’s try this thing out”
– “Adding more stuff”
– “Oops! Revert”
– “OK, I’m outta here”
Events: Created, Updated, Deleted, Restored
17. Storing Events
• What’s in an event?
– The entity (i.e. site) ID
– Some notion of time*
– Event data (e.g. type)
– Some metadata
• We can model this!
Column | Type | PK? | Null?
site_id | binary(16) | ✔ | ✖
version | int | ✔ | ✖
created_at | timestamp | ✖ | ✖
payload | blob | ✖ | ✖
* No, you can’t use timestamps for versioning. More on this later
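One way to try out the table above is with Python's built-in `sqlite3` module. This is only a sketch: the slides do not name a database, the table name `events` and the JSON payloads are assumptions, and SQLite's column types only approximate `binary(16)` and `timestamp`.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        site_id    BLOB      NOT NULL,  -- 16-byte site UUID
        version    INTEGER   NOT NULL,  -- position within the site's stream
        created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        payload    BLOB      NOT NULL,  -- serialized event data
        PRIMARY KEY (site_id, version)
    )
""")

site_id = uuid.uuid4().bytes  # 16 bytes, matching binary(16)

# Writes only ever append; the payload format here is made up.
insert = "INSERT INTO events (site_id, version, payload) VALUES (?, ?, ?)"
conn.execute(insert, (site_id, 0, b'{"type": "Created"}'))
conn.execute(insert, (site_id, 1, b'{"type": "Updated"}'))

# Reading a stream means ordering by version, not by timestamp.
rows = conn.execute(
    "SELECT version, payload FROM events WHERE site_id = ? ORDER BY version",
    (site_id,),
).fetchall()
print([version for version, _ in rows])  # [0, 1]
```

The composite primary key means a second event with the same `(site_id, version)` is refused by the database itself.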
18. Sketching the API
• Commands are wishes
• Wishes may be rejected:
– Conflicting updates
– Stale version
– Invalid arguments
• Or granted:
– Result: new events!
Command | SLA
Create Site | Reasonable
Get Latest | Very fast
Get Version | Reasonable
Update | Very fast
Delete | Reasonable
Restore | Reasonable
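The "commands are wishes" idea might look like this for the Update command: the handler either rejects the wish or returns the new events to append. Everything here (`CommandRejected`, `Event`, `handle_update`, the `base_version` parameter) is hypothetical naming for illustration, not the API from the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


class CommandRejected(Exception):
    """The wish was not granted; no events are produced."""


@dataclass(frozen=True)
class Event:
    type: str
    version: int
    payload: Optional[str] = None


def handle_update(stream: List[Event], base_version: int, content: str) -> List[Event]:
    """Decide an Update command against the site's current event stream."""
    if not content:
        raise CommandRejected("invalid arguments: empty content")
    if not stream:
        raise CommandRejected("site does not exist")
    current = stream[-1].version
    if base_version != current:
        raise CommandRejected(f"stale version: {base_version} != {current}")
    # Granted: the result is one or more new events, never a mutation.
    return [Event("Updated", current + 1, content)]


stream = [Event("Created", 0)]
new_events = handle_update(stream, base_version=0, content="<html>v1</html>")
stream.extend(new_events)
print(stream[-1])
```

Keeping the decision (`handle_update`) separate from the append keeps the handler a pure function that is easy to test.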
20. Hold Your Horses
• Still some concerns:
– Conflicting updates
– Performance
– Operations
• Let’s sort ’em out
Image: Woodennature, “Slow Down” via Wikimedia Commons (CC-BY 3.0)
21. Conflicting Updates
• Concurrent operations
can conflict
– Rapid site activity
– Multiple open tabs
– Normal network issues
• You have to deal with it
Diagram (over time): Created (V0) → Updated (V1) → Updated (V2) → two concurrent Update commands → ?
22. Conflicting Updates
• Strategies
– Last-write wins
– Optimistic locking
– Smart resolution/merge
• In our case
– No concurrent editing
– Simple optimistic locking
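Simple optimistic locking can be sketched with an in-memory store: each writer states the version it last saw, and the append succeeds only if the stream is still at that version. `InMemoryEventStore` and `ConflictError` are hypothetical names, and the lock merely stands in for whatever atomicity a real database provides.

```python
import threading
from collections import defaultdict
from typing import Dict, List


class ConflictError(Exception):
    """Another writer has already appended the next version."""


class InMemoryEventStore:
    def __init__(self) -> None:
        self._streams: Dict[str, List[str]] = defaultdict(list)
        self._lock = threading.Lock()  # stands in for database atomicity

    def current_version(self, site_id: str) -> int:
        """Version of the latest event, or -1 for an empty stream."""
        with self._lock:
            return len(self._streams[site_id]) - 1

    def append(self, site_id: str, expected_version: int, event: str) -> int:
        """Append only if the stream is still at expected_version."""
        with self._lock:
            stream = self._streams[site_id]
            actual = len(stream) - 1
            if actual != expected_version:
                raise ConflictError(f"expected v{expected_version}, found v{actual}")
            stream.append(event)
            return actual + 1


store = InMemoryEventStore()
store.append("site-1", -1, "Created")  # V0
store.append("site-1", 0, "Updated")   # V1
store.append("site-1", 1, "Updated")   # V2

# Two browser tabs both loaded V2 and now race to write the next version.
seen_by_tab_a = seen_by_tab_b = store.current_version("site-1")
store.append("site-1", seen_by_tab_a, "Updated")      # succeeds, becomes V3
try:
    store.append("site-1", seen_by_tab_b, "Updated")  # still expects V2
    outcome = "accepted"
except ConflictError:
    outcome = "rejected"
print(outcome)  # rejected
```

The losing writer has to reload the stream and retry (or give up); nothing is silently overwritten, unlike last-write-wins.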
23. Performance
• Updates are easy
• What about reads?
– Load all events for site
– Feed into materializer
– Spit snapshot out
• This can hurt.
Image: Dominik Kowanda, “Autobahn” via Flickr (CC-BY-ND 2.0)
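The naïve read path described above (load every event, feed it to a materializer, emit a snapshot) might be sketched as follows. The slides do not define event payloads, so two assumptions are made here: Created and Updated carry the full site content, and Restored undoes a soft delete. `Event`, `Site` and `materialize` are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Optional


@dataclass(frozen=True)
class Event:
    type: str                      # Created, Updated, Deleted or Restored
    content: Optional[str] = None  # site blob, where applicable (assumption)


@dataclass(frozen=True)
class Site:
    version: int
    content: Optional[str]
    deleted: bool = False


def materialize(events: Iterable[Event]) -> Optional[Site]:
    """Replay a site's whole stream, oldest first, into its latest state."""
    site: Optional[Site] = None
    for version, event in enumerate(events):
        if event.type == "Created":
            site = Site(version, event.content)
        elif site is None:
            raise ValueError("stream must start with a Created event")
        elif event.type == "Updated":
            site = replace(site, version=version, content=event.content)
        elif event.type == "Deleted":
            site = replace(site, version=version, deleted=True)  # soft delete
        elif event.type == "Restored":
            site = replace(site, version=version, deleted=False)  # assumed semantics
        else:
            raise ValueError(f"unknown event type: {event.type}")
    return site


history = [
    Event("Created", "a"),
    Event("Updated", "b"),
    Event("Deleted"),
    Event("Restored"),
]
latest = materialize(history)
print(latest)
```

Every read walks the entire stream, so the cost grows with the number and size of events, which is exactly why this can hurt.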
24. Performance
• How many events?
– Domain-specific
(for ADI, easily 1000s)
– But events never die
• How big are they?
– Again, domain-specific
– Let’s assume 10-100KB
• Naïve reads will fail: 1,000 events at 10-100KB each means loading 10-100MB per read.
Image: madaise, “Jenga” via Flickr (CC-BY-ND 2.0)
25. Performance
• Remember CQRS?
– Reads are distinct from writes
– We can use another store!
• We’ll use snapshots
– Immutable
– Ephemeral
– Tunable (space/performance)
Diagram (over time): events V0, V1, V2, V3, V4, V5, V6, V7, … with snapshots S3 and S6
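A snapshot-assisted read can be sketched like this: start from the newest usable snapshot and replay only the events after it. The materializer here is a toy (it just concatenates strings), the snapshot store is a plain dict keyed by version, and `read_latest` is a hypothetical name; it is assumed that snapshot S3 holds the state after V3, and S6 the state after V6.

```python
from typing import Dict, List, Tuple


def apply(state: str, event: str) -> str:
    """Toy materializer step: each event appends its text to the state."""
    return state + event


def read_latest(events: List[str], snapshots: Dict[int, str]) -> Tuple[str, int]:
    """Return (latest state, number of events that had to be replayed)."""
    usable = [version for version in snapshots if version < len(events)]
    if usable:
        base = max(usable)                     # newest snapshot wins
        state, start = snapshots[base], base + 1
    else:
        state, start = "", 0                   # no snapshot: replay everything
    for event in events[start:]:
        state = apply(state, event)
    return state, len(events) - start


events = [f"e{i};" for i in range(8)]          # V0 .. V7

# Snapshots taken after V3 and V6 (assumed meaning of S3 and S6).
snapshots = {3: "".join(events[:4]), 6: "".join(events[:7])}

with_snapshots = read_latest(events, snapshots)
without_snapshots = read_latest(events, {})
print(with_snapshots[1], without_snapshots[1])  # 1 8
```

Both reads produce the same state, so snapshots can be dropped and rebuilt at will; how often to take them is the space/performance knob.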
28. No Silver Bullet
• Event sourcing is awesome
• But it’s a trade-off
– Learning curve
– Increased storage
– More knobs to turn
• Make it wisely!
Image: Ed Schipul, “silver bullet” via Flickr (CC-BY-SA 2.0)
29. Eventual Consistency
• Consistency is a trade-off
– Performance
– Complexity
– Cost (tech, support)
• Make it wisely!
Image: Michael Coghlan, “Scales of Justice - Frankfurt Version” via Flickr (CC-BY-SA 2.0)
30. Forethought
• Define target SLAs
– Latency
– Consistency
• Place sanity limits
– Stream size
– Write throttling
• Invest in tooling
– Debug/replay
– Schema evolution
Image: U.S. Army Corps of Engineers, “USACE Base Camp Development Planning Course” via Flickr (CC-BY 2.0)
31. Further Reading
• Scaling Event Sourcing for Netflix Downloads (Phillipa Avery & Robert Reta)
• How shit works: Time (Tomer Gabel)
32. QUESTIONS?
Thank you for listening
tomer@tomergabel.com
@tomerg
http://engineering.wix.com
Sample implementation:
http://tinyurl.com/event-sourcing-sample
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.