Event streams as the source of truth for applications have risen in popularity: Thoughtworks lists the technique in the Assess phase of its Technology Radar (Nov 2017). At DriveTribe we are slightly ahead of the curve, having used it in production with Flink since the launch of our platform in Nov 2016. In this talk we discuss how to combine this technique with some functional programming and Flink to design distributed, highly scalable applications and data platforms.
2. DRIVETRIBE
▸ The world's biggest motoring community.
▸ A social platform for petrolheads.
▸ By Clarkson, Hammond and May.
3. DRIVETRIBE
▸ A content destination at the core.
▸ Users consume feeds of content: images, videos, long-form articles.
▸ Content is organised in homogeneous categories called “tribes”.
▸ Different users have different interests, and the tribe model lets them mix and match at will.
4. DRIVETRIBE
DRIVETRIBE ARTICLE
▸ Single article by Richard Hammond.
▸ Contains a plethora of content and
engagement information.
▸ What kind of data do we need in
order to compute an aggregation like
this?
12. DRIVETRIBE
QUINTESSENTIAL PREREQUISITES
▸ Scalable. Jeremy Clarkson has 7.2M Twitter followers. Cannot really hack
it and worry about it later.
▸ Performant. Low latency is key and mobile networks add quite a bit of it.
▸ Flexible. Almost nobody gets it right the first time around. The ability to
iterate is paramount.
▸ Maintainable. Spaghetti code works like interest on debt.
13. DRIVETRIBE
THREE TIER APPROACH
▸ Clients interact with a fleet of stateless servers
(aka “API” servers or “Backend”) via HTTP
(which is stateless).
▸ Global shared mutable state (aka the Database).
▸ Starting simple: Store data in a DB.
▸ Starting simple: Compute the aggregated views
on the fly.
15. DRIVETRIBE
DRIVETRIBE FEED OF ARTICLES
▸ (6 queries per item) x (Y items per page).
▸ Cost of ranking and personalisation.
▸ Quite some work at read time.
▸ Slow. Not really Performant.
▸ Also very expensive to scale.
16. DRIVETRIBE
WRITE TIME AGGREGATION
▸ Compute the aggregation at write
time.
▸ Then a single query can fetch all the
views at once. That scales.
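As a rough illustration of why this scales better, the sketch below counts queries under read-time and write-time aggregation. It is plain Python rather than the talk's own code, and the in-memory `FakeDb`, the six per-item tables and the `feed_view` table are hypothetical stand-ins, not DriveTribe's actual schema.

```python
class FakeDb:
    """Hypothetical in-memory store that counts the queries it serves."""

    def __init__(self):
        self.tables = {}
        self.queries = 0

    def put(self, table, key, value):
        self.tables.setdefault(table, {})[key] = value

    def get(self, table, key):
        self.queries += 1
        return self.tables.get(table, {}).get(key)


# Assumption: six lookups per item, matching "6 queries per item".
PER_ITEM_TABLES = ["post", "author", "tribe", "likes", "comments", "views"]


def feed_at_read_time(db, post_ids):
    # Read-time aggregation: one query per table for every item on the page.
    return [{t: db.get(t, pid) for t in PER_ITEM_TABLES} for pid in post_ids]


def materialise_page(db, page_id, post_ids):
    # Write-time aggregation: assemble the view once, when the data changes.
    view = [{t: db.tables[t][pid] for t in PER_ITEM_TABLES} for pid in post_ids]
    db.put("feed_view", page_id, view)


def feed_at_write_time(db, page_id):
    # A single query fetches the precomputed page.
    return db.get("feed_view", page_id)


db = FakeDb()
post_ids = [f"post-{i}" for i in range(10)]
for pid in post_ids:
    for table in PER_ITEM_TABLES:
        db.put(table, pid, f"{table} of {pid}")

read_time_view = feed_at_read_time(db, post_ids)
read_time_queries = db.queries          # 6 queries x 10 items

materialise_page(db, "home", post_ids)  # done on the write path
db.queries = 0
write_time_view = feed_at_write_time(db, "home")
write_time_queries = db.queries         # one query for the whole page
```

The work has not disappeared: it has moved to the write path, where `materialise_page` must be re-run whenever any of the underlying data changes.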
23. DRIVETRIBE
WRITE TIME AGGREGATION - EVOLUTION
▸ sendNotification
▸ updateUserStats.
▸ What if we have a cache?
▸ Or a different database for search?
24. DRIVETRIBE
WRITE TIME AGGREGATION - FAN OUT
▸ Requests may need to fan out and update multiple views.
▸ E.g. authors can update their names.
▸ This needs to be propagated to all materialised views.
25. DRIVETRIBE
WRITE TIME AGGREGATION
▸ A simple user action triggers an unbounded sequence of side effects.
▸ Most of which need network IO.
▸ Many of which can fail.
26. DRIVETRIBE
ATOMICITY
▸ What happens if one of them fails? What happens if the server fails in the middle?
▸ The state becomes inconsistent.
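A minimal sketch of the failure mode, in plain Python with in-memory dictionaries standing in for the separate stores. The `handle_like` function and the `fail_at` switch are hypothetical; only the names `send_notification` and `update_user_stats` echo the side effects named on the earlier slide.

```python
class NetworkError(Exception):
    """Simulated failure of a remote call."""


def handle_like(stores, post_id, user_id, fail_at=None):
    """Run the side effects of one "like" in sequence, with no transaction around them."""

    def update_likes():
        stores["likes"].setdefault(post_id, set()).add(user_id)

    def send_notification():
        stores["notifications"].append((post_id, user_id))

    def update_user_stats():
        stores["user_stats"][user_id] = stores["user_stats"].get(user_id, 0) + 1

    for effect in (update_likes, send_notification, update_user_stats):
        if effect.__name__ == fail_at:
            # Simulate the network call behind this effect failing.
            raise NetworkError(f"{effect.__name__} failed")
        effect()


stores = {"likes": {}, "notifications": [], "user_stats": {}}
try:
    handle_like(stores, "post-1", "john", fail_at="update_user_stats")
except NetworkError:
    pass  # the first two effects were already applied and are not rolled back
```

After the failure the like is recorded and the notification is sent, but the user stats never change, so the three stores disagree about what happened.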
27. DRIVETRIBE
CONCURRENCY
▸ Concurrent mutations on global shared state entail race conditions.
▸ State mutations are destructive and cannot be (easily) undone.
▸ A bug can corrupt the data permanently.
28. DRIVETRIBE
ITALIAN PASTA
▸ Model evolution becomes difficult. Reads and writes are tightly coupled.
▸ Migrations are scary.
▸ This is neither Extensible nor Maintainable.
29. DRIVETRIBE
DIFFERENT APPROACH
▸ Let’s take a step back and try to decouple things.
▸ Clients send events to the API: “John liked Jeremy’s post”, “Maria updated her profile”.
▸ Events are immutable. They capture a user action at some point in time.
▸ Every application state instance can be modelled as a projection of those events.
30. DRIVETRIBE
▸ Persisting those events yields an append-only log.
▸ An event reducer can then produce application state instances.
▸ Even retroactively: the log is immutable.
▸ This is event sourcing.
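The idea can be sketched in a few lines of plain Python: an event reducer is a function from (state, event) to a new state, and a projection is a fold of that function over the log. The event types and reducer names below are hypothetical illustrations, not the talk's own code.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Liked:
    user: str
    post: str


@dataclass(frozen=True)
class ProfileUpdated:
    user: str
    name: str


def reduce_like_counts(state, event):
    """Event reducer: fold one event into a {post: like count} projection."""
    if isinstance(event, Liked):
        return {**state, event.post: state.get(event.post, 0) + 1}
    return state


def reduce_display_names(state, event):
    """A second reducer, imagined as written later, over the same log."""
    if isinstance(event, ProfileUpdated):
        return {**state, event.user: event.name}
    return state


# The append-only log: an immutable sequence of immutable events.
log = (
    Liked("john", "jeremys-post"),
    ProfileUpdated("maria", "Maria R."),
    Liked("maria", "jeremys-post"),
)

like_counts = reduce(reduce_like_counts, log, {})
display_names = reduce(reduce_display_names, log, {})
```

Because the log is never mutated, `reduce_display_names` can be introduced after the events were recorded and still see every one of them by replaying from the start.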
31. DRIVETRIBE
▸ The write-time model (command model) and the read-time model (query model) can be separated.
▸ Decoupling the two models allows each path to be optimised independently.
▸ This is known as Command Query Responsibility Segregation, aka CQRS.
39. DRIVETRIBE
EVENT SOURCING APPROACH
[Diagram labels: Store Like Event, ArticleStatsReducer, NotificationReducer, UserStatsReducer, “Sky is the limit”, Get Articles, “Like!!”. Callouts: Extensibility? Maintainability?]
44. DRIVETRIBE
APACHE KAFKA
▸ Distributed, fault-tolerant, durable and fast append-only log.
▸ Can scale to thousands of nodes, producers and consumers.
▸ Each business event type can be stored in its own topic.
▸ The source of truth.
49. DRIVETRIBE
EVENT SOURCING IN PRACTICE
[Diagram labels: Store raw events, Consume raw events, Produce aggregated views, Retrieve aggregated views. Slide 50 adds the label “Websocket connection” and slide 51 adds the label “Stateful”.]
55. DRIVETRIBE
EXAMPLE: COUNTING LIKES
▸ Users can “bump” a post if they like it
▸ Count how many users like a post
▸ Design goals:
▸ Modularity (DRY principle)
▸ Testability
▸ Conciseness
56. DRIVETRIBE
ARCHITECTURAL TRADE-OFFS
▸ Completely asynchronous architecture.
▸ Client requests are eventually consistent.
▸ This can cause duplicate events despite exactly-once semantics.
▸ How do we solve this problem?
[Diagram labels: Like event, Like reducer, Persister]
57. DRIVETRIBE
COUNTING LIKES: PROBLEM REPRESENTATION
▸ Like event
▸ Stream of likes
▸ Accumulate likes with a stateful operator
▸ Count the number of users who like a post
▸ Emit state to be persisted in the database
59. DRIVETRIBE
FIRST ATTEMPT: COUNTING WITH LONGS
▸ Transform LikeEvent into PostState
▸ Key by Post Id
▸ Combine PostState
▸ Emit new PostState
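The four steps above can be sketched without a stream processor. The Python below emulates them over a plain list; it is not Flink code, and the field names on `LikeEvent` and `PostState` as well as the `count_likes` helper are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LikeEvent:
    user_id: str
    post_id: str


@dataclass(frozen=True)
class PostState:
    post_id: str
    likes: int  # the "Long" counter

    def combine(self, other):
        return PostState(self.post_id, self.likes + other.likes)


def count_likes(events):
    """Emulate transform -> key by post id -> combine, emitting every updated state."""
    state_by_post = {}  # stands in for the operator's keyed state
    emitted = []
    for event in events:
        update = PostState(event.post_id, 1)          # transform LikeEvent into PostState
        current = state_by_post.get(update.post_id)   # key by post id
        new_state = update if current is None else current.combine(update)
        state_by_post[update.post_id] = new_state
        emitted.append(new_state)                     # emit new PostState
    return emitted


events = [LikeEvent("john", "p1"), LikeEvent("maria", "p1"), LikeEvent("john", "p2")]
emitted = count_likes(events)
```

Each incoming event yields one emitted `PostState`, so a downstream persister always writes the latest count for the affected post.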
60. DRIVETRIBE
FIRST ATTEMPT: COUNTING WITH LONGS
▸ What happens when we have duplicate events?
▸ Each event will be counted separately, leading to wrong results.
▸ We want to count the number of distinct users who liked a post.
▸ Do we know of a data structure which handles deduplication?
61. DRIVETRIBE
SECOND ATTEMPT: COUNTING WITH SETS
▸ Replace Long with a Set of user Ids.
▸ Set union is idempotent.
▸ Eliminates duplicates.
▸ Like count is modeled as the size of the Set.
[Code example: set idempotence]
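The slide's original code is not preserved in this transcript. As a stand-in, here is a minimal Python sketch of the same idea, with a hypothetical `PostState` that holds a `frozenset` of user ids and a hypothetical `like` helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostState:
    post_id: str
    user_ids: frozenset

    def combine(self, other):
        # Set union: combining with the same user twice changes nothing.
        return PostState(self.post_id, self.user_ids | other.user_ids)

    @property
    def like_count(self):
        return len(self.user_ids)


def like(post_id, user_id):
    return PostState(post_id, frozenset({user_id}))


john = like("p1", "john")
maria = like("p1", "maria")

once = john.combine(maria)
with_duplicate = john.combine(maria).combine(maria)  # Maria's like is delivered twice

long_counter_with_duplicate = 1 + 1 + 1  # what a Long counter would report
```

With sets the duplicated like leaves the state unchanged, whereas the integer counter reports three likes from two users.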
62. DRIVETRIBE
GENERALIZING OUR SOLUTION
▸ Two similar approaches.
▸ Both have a combine method: combine or |+|.
▸ What are the desired properties?
  1. Associativity: (a |+| b) |+| c == a |+| (b |+| c) => data parallelism
  2. Idempotence: a |+| b |+| b == a |+| b => deduplication
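To make the two properties concrete, the Python sketch below checks them exhaustively over a small sample of values for both candidate combine operations. The helper names are hypothetical, and a check over a finite sample is evidence, not a proof.

```python
from itertools import product


def is_associative(combine, values):
    # (a |+| b) |+| c == a |+| (b |+| c)
    return all(
        combine(combine(a, b), c) == combine(a, combine(b, c))
        for a, b, c in product(values, repeat=3)
    )


def is_idempotent(combine, values):
    # a |+| b |+| b == a |+| b
    return all(
        combine(combine(a, b), b) == combine(a, b)
        for a, b in product(values, repeat=2)
    )


def union(a, b):
    return a | b


def add(a, b):
    return a + b


sets = [frozenset(), frozenset({"john"}), frozenset({"maria"}), frozenset({"john", "maria"})]
longs = [0, 1, 2, 3]

set_laws = (is_associative(union, sets), is_idempotent(union, sets))
long_laws = (is_associative(add, longs), is_idempotent(add, longs))
```

Both operations are associative, so both can be computed in parallel over partitions, but only set union is idempotent, which is why only the Set version survives duplicate events.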
63. DRIVETRIBE
BAND ALGEBRA TO THE RESCUE
▸ The Band algebra consists of two things:
  1. A type with an API (typeclass, interface)
  2. A set of laws:
     a. Associativity
     b. Idempotence
64. DRIVETRIBE
POSTSTATE BAND
▸ Assume the Ids of two PostStates are the same.
▸ PostState with the combine operator forms a Band algebra.
▸ |+| is shorthand for combine.
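A rough Python rendering of this, assuming an abstract `Band` base class in place of a Scala typeclass and the `|` operator as a stand-in for `|+|` (Python cannot define `|+|`). The class and helper names are hypothetical. The example also shows what associativity buys: partial results computed on separate partitions combine to the same state as a sequential fold.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from functools import reduce


class Band(ABC):
    """The API half of the algebra. The laws are a contract, not enforced here."""

    @abstractmethod
    def combine(self, other):
        ...

    def __or__(self, other):  # stand-in for |+|
        return self.combine(other)


@dataclass(frozen=True)
class PostState(Band):
    post_id: str
    user_ids: frozenset

    def combine(self, other):
        # The slide's assumption, made explicit: both states describe the same post.
        if self.post_id != other.post_id:
            raise ValueError("can only combine states of the same post")
        return PostState(self.post_id, self.user_ids | other.user_ids)


def like(user_id):
    return PostState("p1", frozenset({user_id}))


events = [like("john"), like("maria"), like("john"), like("ana")]

sequential = reduce(PostState.combine, events)

# Two "partitions" reduced independently, then merged.
left = reduce(PostState.combine, events[:2])
right = reduce(PostState.combine, events[2:])
parallel = left | right
```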
66. DRIVETRIBE
IMPROVING OUR SOLUTION
▸ Band satisfies the properties to solve our problem
▸ Set has linear space complexity
▸ Can we use a better Band?
▸ HyperLogLog is an approximation algorithm with constant space complexity which forms a Band.
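To show why HyperLogLog fits the same interface, here is a deliberately small Python sketch: a fixed array of registers, merged by taking the element-wise maximum, which is associative and idempotent. The parameter choices (4096 registers, SHA-1 as the hash) are assumptions for illustration; a production system would use a tuned library implementation.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, p=12, registers=None):
        self.p = p
        self.m = 1 << p  # number of registers, fixed regardless of input size
        self.registers = list(registers) if registers is not None else [0] * self.m

    def add(self, item):
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)                 # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros in rest, plus one
        self.registers[index] = max(self.registers[index], rank)

    def combine(self, other):
        # Element-wise max: associative, commutative and idempotent.
        merged = [max(x, y) for x, y in zip(self.registers, other.registers)]
        return HyperLogLog(self.p, merged)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw


a, b = HyperLogLog(), HyperLogLog()
for i in range(12_000):
    a.add(f"user-{i}")
for i in range(8_000, 20_000):
    b.add(f"user-{i}")

merged = a.combine(b)                 # 20,000 distinct users, 4,000 seen by both
merged_again = merged.combine(b)      # replaying b changes nothing
approx_likes = merged.estimate()
```

The trade-off is accuracy: the like count becomes an estimate, in exchange for state whose size does not grow with the number of users.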
68. DRIVETRIBE
MODULARITY
▸ Focus on properties of the solution
▸ Replace with a different implementation with the same properties
▸ We have achieved one of our design goals: modularity
70. DRIVETRIBE
TESTABILITY
▸ Property based testing (à la QuickCheck / ScalaCheck)
▸ Generate random inputs
▸ Verify invariants / laws / properties
▸ Automatically check laws with the Typelevel Cats Laws library
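The same style of test can be approximated with nothing but the standard library. The sketch below is a hand-rolled Python stand-in for what ScalaCheck and Cats Laws automate: it generates random inputs from a seeded generator and asserts the two Band laws. The function names and the small user universe are hypothetical.

```python
import random


def random_user_set(rng, universe=("john", "maria", "ana", "li", "omar")):
    """Generator: a random subset of a small universe of user ids."""
    return frozenset(user for user in universe if rng.random() < 0.5)


def check_band_laws(combine, generate, trials=200, seed=0):
    """Assert associativity and idempotence on randomly generated inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = generate(rng), generate(rng), generate(rng)
        assert combine(combine(a, b), c) == combine(a, combine(b, c)), "associativity"
        assert combine(a, a) == a, "idempotence"
    return True


set_union_is_band = check_band_laws(lambda a, b: a | b, random_user_set)

try:
    check_band_laws(lambda a, b: a + b, lambda rng: rng.randint(1, 10))
    addition_failure = None
except AssertionError as error:
    addition_failure = str(error)  # which law addition violated
```

Set union passes every trial, while integer addition is rejected on the idempotence law, so the test catches exactly the bug that duplicate events would otherwise expose in production.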
71. DRIVETRIBE
MORE ALGEBRAS
▸ The Cats library in Scala provides a complete algebra hierarchy.
▸ Monoid: represents empty values.
▸ Semilattice: aggregates out-of-order events (CRDTs).
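A small Python sketch of both ideas, using set union again. The names `EMPTY`, `combine` and `combine_all` are hypothetical. A Monoid contributes an identity element, so a fold has a starting value even when no events have arrived; a Semilattice is a Band whose combine is also commutative, so the order of arrival does not matter.

```python
import random
from functools import reduce

EMPTY = frozenset()  # Monoid: the identity element for set union


def combine(a, b):
    return a | b


def combine_all(states):
    # Starting from EMPTY means an empty stream still has a well-defined result.
    return reduce(combine, states, EMPTY)


events = [frozenset({user}) for user in ["john", "maria", "john", "ana"]]

in_order = combine_all(events)

shuffled = events[:]
random.Random(42).shuffle(shuffled)  # simulate out-of-order delivery
out_of_order = combine_all(shuffled)

no_events = combine_all([])
```

Whatever order the shuffle produces, the result is the same set of three users, which is the property that state-based CRDTs rely on.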
72. DRIVETRIBE
ADVANCED
▸ Automagically derive algebra instances for any case class with some type-level trickery.
▸ Merge multiple streams together with Coproducts and Semilattices.
▸ Write an analytics engine based on a simple Monoid algebra: “An Algebra for Distributed Big Data Analytics”, Leonidas Fegaras.