James Gollan, Confluent, Senior Solutions Engineer
From digital banking to Industry 4.0, the nature of business is changing. Increasingly, businesses are becoming software, and the lifeblood of software is data. Dealing with data at the enterprise level is tough, and there have been some missteps along the way.
This session will consider the increasingly popular idea of a 'data mesh' - the problems it solves and, perhaps most importantly, how an event streaming platform forms the bedrock of this new paradigm.
Recording to be available at cnfl.io/meetup-hub
https://www.meetup.com/KafkaMelbourne/events/277076626/
Domain Driven Data: Apache Kafka® and the Data Mesh
1. Domain Driven Data: Apache Kafka and the Data Mesh
James Gollan, Senior Solutions Engineer at Confluent
2. Why are we talking about Data Mesh at a Kafka meetup?
3. Copyright 2021, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc.
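Producer Guarantees
[Diagram: a producer (P) writes to Topic1, partition1, which is replicated across Broker 1, Broker 2 and Broker 3, with the leader on one broker and followers on the others. An ack is returned to the producer.]
Producer Properties
acks=all
min.insync.replicas=2 (set on the broker or topic)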
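As a rough illustration of how acks=all and min.insync.replicas interact, the Python sketch below models a single replicated partition. It is a toy model under simplifying assumptions, not Kafka client code: the Partition class, its produce method and NotEnoughReplicasError are hypothetical names made up for this example. The idea it shows is that with acks=all the write is only acknowledged once every in-sync replica has the record, and the write is rejected when fewer than min.insync.replicas replicas are in sync.

```python
from dataclasses import dataclass, field


class NotEnoughReplicasError(Exception):
    """Raised when the in-sync replica set is smaller than min.insync.replicas."""


@dataclass
class Partition:
    # Hypothetical toy model of one partition; not part of any Kafka client.
    replicas: list                # broker ids hosting the partition, leader first
    in_sync: set                  # broker ids currently in sync (includes the leader)
    min_insync_replicas: int = 2
    logs: dict = field(default_factory=dict)

    def produce(self, record, acks="all"):
        leader = self.replicas[0]
        if acks == "all":
            # Reject the write if too few replicas are in sync.
            if len(self.in_sync) < self.min_insync_replicas:
                raise NotEnoughReplicasError(
                    f"{len(self.in_sync)} in-sync replica(s), "
                    f"need {self.min_insync_replicas}"
                )
            targets = self.in_sync
        else:
            # Simplification: with acks=1 only the leader write is awaited.
            targets = {leader}
        for broker in targets:
            self.logs.setdefault(broker, []).append(record)
        return "ack"


partition = Partition(replicas=[1, 2, 3], in_sync={1, 2, 3})
print(partition.produce("order-created"))  # record is on all three brokers

partition.in_sync = {1}  # brokers 2 and 3 fall out of sync
try:
    partition.produce("order-paid")
except NotEnoughReplicasError as err:
    print("rejected:", err)
```

In this model, losing two of the three replicas makes the partition refuse acks=all writes rather than acknowledge a record that exists on only one broker.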
4. What problem are we trying to solve?
Monolithic datastores
Centralised processing and governance
Bottlenecks for processing and analysing data for the business
Data scientists don’t have a full understanding of the data’s context
Results in a data swamp
6. Data Mesh
Concept first spoken about by Zhamak Dehghani from ThoughtWorks
Break apart the ‘data monolith’
Treats domains as first-class citizens when dealing with data
Domains encouraged to stop treating data as an asset, and to start treating data as a product
Emphasis on a self-service data platform
Federated governance of organisational data
7. This seems kinda familiar - where have I heard this before...
Could it be microservices?
8. Microservices and Domain Driven Design
This problem has been solved before for the monolithic application
This was broken down into microservices
Creation of these microservices emphasises business domains
The bounded context provides the public interfaces for the domain
Within the bounded context, domain-specific language and business logic are used
12. Kafka’s role in the data mesh
Kafka
Kafka facilitates the data mesh by acting as a central hub for events
Infinite storage in Kafka allows it to be used as the source of truth within the organisation
13. Kafka’s role in the data mesh
Connect
Domain-based Connect workers allow domain owners to integrate source and sink connectors
One of these sinks might be a data warehouse where the domain can conduct analysis on their data sets, potentially combined with organisation-wide data sets from other domains
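To make the data warehouse sink a little more concrete, the Python sketch below builds the kind of request body a domain team might submit to its own Connect workers through the Kafka Connect REST API (POST /connectors). It only constructs and prints the JSON and sends nothing. The helper build_sink_connector, the connector name, the topic names and the JDBC URL are all hypothetical values chosen for illustration, and the JDBC sink connector is just one possible choice of sink.

```python
import json


def build_sink_connector(domain, topics, connection_url):
    """Return a request body for the Kafka Connect REST API (POST /connectors)."""
    return {
        "name": f"{domain}-warehouse-sink",  # hypothetical naming scheme
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "tasks.max": "1",
            # Topics from this domain and, optionally, other domains.
            "topics": ",".join(topics),
            "connection.url": connection_url,
            "auto.create": "true",
        },
    }


payload = build_sink_connector(
    domain="payments",
    topics=["payments.transactions", "customers.profiles"],  # hypothetical topics
    connection_url="jdbc:postgresql://warehouse.example.internal:5432/analytics",
)
print(json.dumps(payload, indent=2))
```

Because each domain runs its own workers, a config like this can be owned, versioned and deployed by the domain team rather than a central integration team.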
14. Kafka’s role in the data mesh
ksqlDB
Distributed ksqlDB allows domains to run their own real-time stream processing
This may be used to prepare data from multiple topics for publication across the organisation
It may also be used for advanced stream processing, such as real-time fraud detection
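As a hedged sketch of the fraud detection idea, the Python below counts transactions per card in fixed (tumbling) time windows and flags any card that exceeds a threshold. It is a plain-Python stand-in for the kind of windowed aggregation a domain might express in ksqlDB, and it processes a finished list rather than a live stream. The event fields card_id and ts, the 60-second window and the limit of 3 are assumptions made for the example.

```python
from collections import Counter

WINDOW_SECONDS = 60        # assumed tumbling window size
MAX_TXNS_PER_WINDOW = 3    # assumed threshold for "suspicious"


def flag_suspicious(events, window=WINDOW_SECONDS, limit=MAX_TXNS_PER_WINDOW):
    """Return sorted (card_id, window_start) pairs with more than `limit` events."""
    counts = Counter()
    for event in events:
        # Align each timestamp to the start of its tumbling window.
        window_start = event["ts"] - (event["ts"] % window)
        counts[(event["card_id"], window_start)] += 1
    return sorted(key for key, n in counts.items() if n > limit)


events = [
    {"card_id": "c-1", "ts": 0},
    {"card_id": "c-1", "ts": 10},
    {"card_id": "c-1", "ts": 20},
    {"card_id": "c-1", "ts": 30},
    {"card_id": "c-2", "ts": 15},
    {"card_id": "c-1", "ts": 70},
]
print(flag_suspicious(events))  # [('c-1', 0)]
```

Card c-1 makes four transactions in the window starting at 0, which exceeds the limit, while its single transaction in the next window does not.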
15. Kafka’s role in the data mesh
Schema registry
With event-driven architecture, the schema is the API
Schema registry ensures consistency in event structure, and enables forward and backward compatibility across schema changes
It may be extended to provide more data governance features, such as field-level tagging and data catalog functionality
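To illustrate what forward and backward compatibility mean in practice, here is a deliberately simplified Python sketch. It is not the Schema Registry API and not a full implementation of Avro's resolution rules (it ignores type promotion, aliases and nested types). Schemas are modelled as plain dicts, and can_read, is_backward_compatible and is_forward_compatible are hypothetical helper names. The rule it encodes is that a reader can cope with a field the writer did not send only if the reader's schema supplies a default.

```python
def can_read(reader, writer):
    """True if data written with `writer` can be read using `reader`."""
    for name, spec in reader.items():
        if name in writer:
            if writer[name]["type"] != spec["type"]:
                return False      # same field, different type
        elif "default" not in spec:
            return False          # field missing from the data, no default
    return True                   # extra writer fields are simply ignored


def is_backward_compatible(new, old):
    # Consumers on the new schema can read data produced with the old one.
    return can_read(reader=new, writer=old)


def is_forward_compatible(new, old):
    # Consumers still on the old schema can read data produced with the new one.
    return can_read(reader=old, writer=new)


v1 = {"id": {"type": "string"}, "amount": {"type": "double"}}
v2 = {**v1, "currency": {"type": "string", "default": "AUD"}}
v3 = {**v1, "channel": {"type": "string"}}  # new field without a default

print(is_backward_compatible(v2, v1), is_forward_compatible(v2, v1))  # True True
print(is_backward_compatible(v3, v1), is_forward_compatible(v3, v1))  # False True
```

Adding a field with a default keeps both directions working, whereas adding a required field breaks consumers that upgrade before all old data has aged out.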
16. Converters