These are the slides I presented at the Melbourne 2017 YOW! CTO Summit. The talk followed Culture Amp's steps and missteps on our journey to micro-services, and how we came to find CQRS and Event Sourcing fantastic tools.
3. Our journey
Our meandering path to CQRS & event sourcing
• Attempt 1 - Asynchronous micro-services
Event Log the monolith and introduce query services
• Attempt 2 - Greenfield systems of record
Rebuild part of the monolith as a standalone command-side service
• Attempt 3 - Refactor the monolith first
And start with the least depended-upon aggregates
4. Attempt 1: Asynchronous micro-services
Why micro-services?
• Too many developers!
• Monolith was becoming too complex.
• Want teams to control their own assets.
• Flexibility to introduce fit-for-purpose technologies.
(for example, Elasticsearch for our comments search)
• Scale different parts of the app independently.
(for example, capture has a very different profile)
5. Attempt 1: Asynchronous micro-services
Why asynchronous?
• Resilience is architected into the platform.
If a micro-service is down, all dependent micro-services remain available
and are eventually consistent. It also protects you from a self-inflicted DoS.
• Ability to replay history for consuming services
Much easier to refactor, or to introduce new services.
• Decoupled architecture enabling platform flexibility
Consumers don’t necessarily need to know where the producers are deployed
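The resilience and replay points above can be sketched in a few lines. This is only an illustration, not Culture Amp's implementation: `EventLog`, `Consumer`, and the comment event names are hypothetical, and an in-memory list stands in for whatever durable log a real platform would use.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Hypothetical in-memory, append-only log standing in for a real one."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        self.events.append(event)

    def read_from(self, offset: int) -> list:
        # Consumers pull at their own pace; the producer never calls them.
        return self.events[offset:]


@dataclass
class Consumer:
    """A consuming service that remembers how far through the log it has read."""
    name: str
    offset: int = 0
    seen: list = field(default_factory=list)

    def catch_up(self, log: EventLog) -> None:
        for event in log.read_from(self.offset):
            self.seen.append(event["type"])
            self.offset += 1


log = EventLog()
search = Consumer("search")

log.append({"type": "CommentAdded"})
search.catch_up(log)

# The search service is "down" while these are produced; the producer is unaffected.
log.append({"type": "CommentEdited"})
log.append({"type": "CommentRemoved"})
search.catch_up(log)  # back up: it resumes from its own offset

# A service introduced later starts at offset 0 and replays the whole history.
reporting = Consumer("reporting")
reporting.catch_up(log)
```

Because each consumer owns its offset, an outage only delays that consumer, and a new or refactored service can rebuild its state by reading from the start.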
7. What went wrong
• Plagued with bad data issues as we tried to convert
our current production database to a stream of events
• Ongoing source-of-truth issues between the
monolith persistence layer and the synthesised event log.
• Events were generated at the wrong business points:
intended for consumption, but ironically problematic for consumers.
• Events were too coarse-grained and slow to process.
• Some of the events weren't real business events.
• Multiple event streams - one for each event type - hurt.
8. What we learned
Asynchronous micro-services are the right long-term aim.
But don’t start with event consumers, start with event producers.
Event logging is not event sourcing!
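One way to picture the difference, as a rough sketch: with event logging the database row is the real write and the event is a by-product that can be missed, so the two can disagree; with event sourcing the event is the only write and state is derived from it. Both classes and the `publish_ok` flag are hypothetical and exist only to simulate a missed event.

```python
class EventLoggedRoster:
    """Event logging: the rows are the truth, the log is a by-product."""

    def __init__(self):
        self.rows = set()
        self.log = []

    def add(self, name, publish_ok=True):
        self.rows.add(name)  # the real write
        if publish_ok:  # a second, separate write that can be missed
            self.log.append(("Added", name))


class EventSourcedRoster:
    """Event sourcing: the log is the truth, the rows are derived from it."""

    def __init__(self):
        self.log = []

    def add(self, name):
        self.log.append(("Added", name))  # the only write

    @property
    def rows(self):
        return {name for kind, name in self.log if kind == "Added"}


logged = EventLoggedRoster()
logged.add("A")
logged.add("B", publish_ok=False)  # simulate an event that never got emitted
rebuilt_from_log = {name for kind, name in logged.log if kind == "Added"}

sourced = EventSourcedRoster()
sourced.add("A")
sourced.add("B")
```

Here `rebuilt_from_log` is missing "B" while `logged.rows` contains it, which is the source-of-truth confusion in miniature. The event-sourced version cannot drift this way because there is nothing else to drift from.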
9. What is Event Sourcing?
[Diagram: a sequence of events (Add Employee A, Add Employee B, Remove Employee A, Add Employee C) shown alongside a shorter sequence (Add Employee B, Add Employee C)]
• Store the business events as the source of truth
• Can answer questions you don’t yet know you have.
• Traditional professions such as law and accounting have used
event sourcing for centuries.
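A minimal sketch of the idea, using the employee events from this slide: current state is computed by folding over the stored events, and a question nobody planned for can be answered from the same history. The event names `EmployeeAdded` and `EmployeeRemoved` and the function names are assumptions made for illustration.

```python
from functools import reduce

# The stored business events are the source of truth.
events = [
    ("EmployeeAdded", "A"),
    ("EmployeeAdded", "B"),
    ("EmployeeRemoved", "A"),
    ("EmployeeAdded", "C"),
]


def apply(state: frozenset, event: tuple) -> frozenset:
    """Apply one event to the previous state and return the next state."""
    kind, employee = event
    if kind == "EmployeeAdded":
        return state | {employee}
    if kind == "EmployeeRemoved":
        return state - {employee}
    return state  # event types we do not know are ignored


def current_employees(history) -> frozenset:
    """Current state is a left fold over the full history."""
    return reduce(apply, history, frozenset())


def ever_employed(history) -> set:
    """A question asked later: who has ever been an employee?"""
    return {employee for kind, employee in history if kind == "EmployeeAdded"}
```

With the events above, `current_employees(events)` contains B and C, while `ever_employed(events)` still includes A. A table holding only current rows would have lost that second answer.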
10. “Every method should either be a command that performs an action, or a query that returns data to the caller, but not both.
In other words, asking a question should not change the answer.”
— Bertrand Meyer (designer of Eiffel language) on CQS
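As a small illustration of command-query separation, assuming a hypothetical `Roster` class: `add` is a command, `count` is a query, and `add_and_count` is the kind of method the principle advises against.

```python
class Roster:
    def __init__(self) -> None:
        self._names = []

    def add(self, name: str) -> None:
        """Command: performs an action and returns nothing."""
        self._names.append(name)

    def count(self) -> int:
        """Query: returns data and leaves state untouched."""
        return len(self._names)

    def add_and_count(self, name: str) -> int:
        """Breaks CQS: the caller cannot get the count without changing it."""
        self._names.append(name)
        return len(self._names)


roster = Roster()
roster.add("A")
# Asking twice gives the same answer, because count() has no side effects.
first, second = roster.count(), roster.count()
```

CQRS applies the same split at the level of whole models rather than individual methods.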
12. CQRS & event sourcing architecture
[Diagram: Client, Command Handler, Aggregate, event store, Projection, Query Handler. Annotation: Eventually consistent!]
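The components named in the diagram could fit together roughly as follows. This is a sketch under assumptions, not the system described in the talk: every class, method, and event name is hypothetical, the store is an in-memory list, and the projection is caught up by an explicit call so the eventual consistency is visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    aggregate_id: str
    kind: str
    data: dict


class EventStore:
    """Append-only store; the only thing the command side writes to."""

    def __init__(self):
        self._events = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def stream(self, aggregate_id: str) -> list:
        return [e for e in self._events if e.aggregate_id == aggregate_id]

    def read_from(self, position: int) -> list:
        return self._events[position:]


class Employee:
    """Aggregate: rebuilt from its own events, decides if a command is valid."""

    def __init__(self, history):
        self.exists = False
        for event in history:
            if event.kind == "EmployeeAdded":
                self.exists = True
            elif event.kind == "EmployeeRemoved":
                self.exists = False

    def add(self, employee_id: str, name: str) -> Event:
        if self.exists:
            raise ValueError("employee already exists")
        return Event(employee_id, "EmployeeAdded", {"name": name})


class CommandHandler:
    def __init__(self, store: EventStore):
        self.store = store

    def add_employee(self, employee_id: str, name: str) -> None:
        aggregate = Employee(self.store.stream(employee_id))
        self.store.append(aggregate.add(employee_id, name))


class EmployeeDirectory:
    """Projection: a read model built by consuming events after the fact."""

    def __init__(self):
        self.names = {}
        self.position = 0

    def catch_up(self, store: EventStore) -> None:
        for event in store.read_from(self.position):
            if event.kind == "EmployeeAdded":
                self.names[event.aggregate_id] = event.data["name"]
            elif event.kind == "EmployeeRemoved":
                self.names.pop(event.aggregate_id, None)
            self.position += 1


class QueryHandler:
    def __init__(self, projection: EmployeeDirectory):
        self.projection = projection

    def employee_names(self) -> list:
        return sorted(self.projection.names.values())


store = EventStore()
commands = CommandHandler(store)
directory = EmployeeDirectory()
queries = QueryHandler(directory)

commands.add_employee("e1", "Ada")
before = queries.employee_names()  # projection has not caught up yet
directory.catch_up(store)
after = queries.employee_names()
```

The gap between `before` and `after` is the "eventually consistent" annotation on the slide: the write has succeeded, but the read model only reflects it once the projection has processed the event.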
13. Attempt 2: Greenfield systems of record
[Diagram: Client, Rails App (Murmur), Query Server, DB, Reporting Service, Event Store, Employee SoR, Projector, Event Log]
14. What worked well?
• CQRS and event sourcing helped us be clear about the
boundaries of context.
• The new service owns the validation, so bad data issues
in accounts we migrated were found and fixed upfront.
• We could treat the monolith as one massive projection,
minimising the work required on the query side and the UI.
• Removed our source of truth problems.
Domain models can be deterministically built from events, not vice versa.
• We could incrementally migrate accounts
15. What didn’t work well?
• Required a leap of faith for the team to agree to try.
• Slow to develop as we were learning a new paradigm,
and rolling our own event sourcing system.
• Eventual consistency hurts:
• Upfront we needed to solve how to update
the monolith in a reasonable timeframe.
• We needed to refactor parts of the UI to cope
with the eventual consistency.
• Disaster recovery becomes a distributed problem.
16. “With all the troubles of evented architecture - it is a blessing when ‘shit hits the fan’.
I don't know if in the past we would even be able to tell the customers affected by the issue.”
— Unsolicited feedback from our
Tech Lead for the Employee Service
17. Attempt 3: Refactor the monolith first
1. Start with the Command Side.
Treat your existing application like one big projection.
2. Tackle a single Aggregate at a time.
No need for a big-bang migration.
3. Generate UUIDs in the legacy monolith first.
These will help you link the migration scripts back.
4. Start by synchronously updating your existing application.
Your legacy UI will thank you for it.
5. When all writes are through CQRS Commands, refactor
the UI to use new dedicated asynchronous projections.
6. When nothing still uses the legacy domain model,
remove it and remove its synchronous projector.
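Steps 1, 3, and 4 can be sketched with an in-memory SQLite database standing in for the monolith's database. The table layout, the `EmployeeRenamed` event, and the function names are all hypothetical, and running the synchronous projector in the same transaction as the event write is an assumption about what "synchronously" means here.

```python
import json
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, uuid TEXT UNIQUE, name TEXT)"
)
db.execute(
    "CREATE TABLE events "
    "(seq INTEGER PRIMARY KEY, aggregate_uuid TEXT, kind TEXT, body TEXT)"
)

# Step 3: legacy rows get UUIDs first, so events can refer back to them.
db.execute(
    "INSERT INTO employees (uuid, name) VALUES (?, ?)", (str(uuid.uuid4()), "Ada")
)
db.commit()


def project_to_legacy(conn, aggregate_uuid, kind, body):
    """Synchronous projector: keeps the legacy table in step with the events."""
    if kind == "EmployeeRenamed":
        conn.execute(
            "UPDATE employees SET name = ? WHERE uuid = ?",
            (body["name"], aggregate_uuid),
        )


def rename_employee(conn, aggregate_uuid, new_name):
    """Command handler: the event is the write; the legacy row is a projection."""
    body = {"name": new_name}
    with conn:  # one transaction: both writes commit, or neither does
        conn.execute(
            "INSERT INTO events (aggregate_uuid, kind, body) VALUES (?, ?, ?)",
            (aggregate_uuid, "EmployeeRenamed", json.dumps(body)),
        )
        project_to_legacy(conn, aggregate_uuid, "EmployeeRenamed", body)


(employee_uuid,) = db.execute(
    "SELECT uuid FROM employees WHERE name = 'Ada'"
).fetchone()
rename_employee(db, employee_uuid, "Ada Lovelace")
```

Because the legacy table is updated before the command returns, the existing UI keeps reading its own tables and never sees stale data. Step 5 would later swap this projector for asynchronous ones, one read path at a time.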
22. Key learnings
1. Event logging is not event sourcing
and just leads to source of truth confusion.
2. CQRS and event sourcing can help you move to
micro-services, but start by refactoring the monolith.
3. If you choose to event source your monolith, take
one step at a time, and focus on the command side.