AWS Data Lake / Lake House + Confluent Cloud for Serverless Apache Kafka. Learn about use cases, architectures, and features.
Data must be continuously collected, processed, and reactively used in applications across the entire enterprise - some in real time, some in batch mode. In other words: As an enterprise becomes increasingly software-defined, it needs a data platform designed primarily for "data in motion" rather than "data at rest."
Apache Kafka is now mainstream when it comes to data in motion! The Kafka API has become the de facto standard for event-driven architectures and event streaming. Unfortunately, running it yourself is very often too expensive once you add factors like scaling, administration, support, security, and building connectors, along with everything else that goes with it. Resources in enterprises are scarce: this applies to both the best team members and the budget.
The cloud - as we all know - offers the perfect solution to such challenges.
Most likely, fully-managed cloud services such as AWS S3, DynamoDB or Redshift are already in use. Now it is time to implement "fully-managed" for Kafka as well - with Confluent Cloud on AWS.
Building a central integration layer that doesn't care where or how much data is coming from.
Implementing scalable data stream processing to gain real-time insights
Leveraging fully managed connectors (like S3, Redshift, Kinesis, MongoDB Atlas & more) to quickly access data
Confluent Cloud in action? Let's show how ao.com made it happen!
8. This is a fundamental paradigm shift...
Cloud: infrastructure as code (the future of the datacenter)
Event streaming: data in motion as continuous streams of events (the future of data)
9. An Event Streaming Platform is the Underpinning of an Event-driven Architecture
MES
ERP
Sensors
Mobile
Customer 360
Real-time
Alerting System
Data warehouse
Producers
Consumers
Streams of real time events
Stream processing
apps
Connectors
Connectors
Stream processing
apps
Supplier
Alert
Forecast
Inventory Customer
Order
10. Car Engine Car Self-driving Car
Confluent Completes Apache Kafka
11. Truly CLOUD-NATIVE experience
at the edge, in the data center,
and in the cloud
Confluent Cloud
A fully managed, cloud-native service for Apache Kafka
Confluent Platform
A complete, enterprise-grade distribution of Apache Kafka
Confluent for
Kubernetes
Ansible
Playbooks
Packages:
Docker, RPMs,
Tarball
Public Cloud Workloads Edge and On-Premise Workloads
On Kubernetes On VMs / Bare Metal
Wavelength
12. STREAM
PROCESSING
CONNECTORS
Example Architecture for Event Streaming
ksqlDB
KStreams
Processing Data in Motion with Confluent Cloud on AWS
Dashboard
Oracle
DB
Oracle
CDC
CONNECTOR
Salesforce CDC
CONNECTOR
Salesforce
Source / Sink
CONNECTOR
Fraud Detection App
13. Context-specific Customer 360
Electrical retailer
Hyper-personalized online retail experience,
turning each customer visit into a one-on-one
marketing opportunity
Correlation of historical customer data with real-time digital signals
Maximize customer satisfaction and revenue
growth, increased customer conversions
https://www.confluent.io/customers/ao/
14. Ingest & Process
Capture event streams with a consistent data structure using
Schema Registry, develop real-time ETL pipelines with a lightweight
SQL syntax using ksqlDB, and unify real-time streams with batch
processing using 100+ Confluent connectors
Derive insights from data in real-time
Mobile
Web
IoT
Data store
AWS & On-prem
Amazon
S3
S3 Sink
ANALYZE
Amazon
Redshift
AWS Lake
Formation
Amazon
Athena
Redshift Sink
TRANSFORM
Amazon
EMR
AWS Data
Pipeline
AWS
Glue
Source
connectors
Store & Analyze
Stream data with Confluent pre-built Connectors into your
AWS data lake or data warehouse to execute queries on vast
amounts of streaming data for real-time and batch analytics
VISUALIZE
Amazon
Elasticsearch
Schema
Registry
ksqlDB
Events
Real-time analytics
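The ingest, transform, and store steps on this slide can be sketched in a few lines. This is only a plain-Python stand-in, assuming a made-up event shape (`customer`, `amount`); `transform` and `batches` are hypothetical helpers, not ksqlDB or Kafka Connect APIs. The first function filters and reshapes events the way a streaming `SELECT ... WHERE` would, the second groups records into fixed-size batches, roughly as a sink connector might before writing objects to a data lake.

```python
from itertools import islice


def transform(events):
    """Filter and reshape events one by one, like a streaming SELECT ... WHERE."""
    for event in events:
        # Drop events without a positive amount (hypothetical business rule).
        if event.get("amount", 0) > 0:
            yield {
                "customer": event["customer"],
                "amount_cents": round(event["amount"] * 100),
            }


def batches(records, flush_size):
    """Group a stream of records into lists of at most flush_size items."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, flush_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    sample = [
        {"customer": "a", "amount": 1.5},
        {"customer": "b", "amount": 0},
        {"customer": "c", "amount": 2.0},
    ]
    for batch in batches(transform(sample), flush_size=2):
        print(batch)  # each batch would become one object in the sink
```

Both functions are generators, so records flow through one at a time and nothing is held in memory beyond the current batch.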
15. Serverless integration
Connect existing apps and data stores in a repeatable way without
having to manage the underlying components: Apache Kafka, Schema Registry
to maintain app compatibility, ksqlDB to develop real-time apps with SQL
syntax, and Connect for effortless integrations with Lambda and data stores
AWS serverless platform
Stop provisioning, maintaining or administering servers for
backend components such as compute, databases and
storage so that you can focus on increasing agility and
innovation for your developer teams
Increase developer agility & speed of innovation
Apps
Microservices
ksqlDB
Schema
Registry
COMPUTE
AWS
Lambda
Data stores
REST Proxy
& Clients
Source
Connectors
Lambda
Sink
DATA STORES
Amazon
DynamoDB
Amazon
Aurora
STORAGE
Amazon
S3
S3 Sink
ANALYTICS
Amazon
Athena
Amazon
Redshift
Serverless app integration
16. Accelerate modernization from on-prem to AWS
Redshift Sink
Lambda Sink
AWS Direct
Connect
LEGACY EDW
MAINFRAME
LEGACY DB
JDBC / CDC
connectors
Connect
Leverage 100+ Confluent pre-built connectors to
continuously bring valuable data from existing
on-prem services, including enterprise data
warehouses, databases, and mainframes
Modernize
Increase agility in getting applications to market
and reduce TCO by freeing up resources to
focus on value-generating activities rather than
managing servers
On-prem AWS Cloud
Bridge
Hybrid cloud streaming with consistent, event-driven architecture for modern apps
On-prem to AWS modernization
Amazon Athena
AWS Glue
SageMaker
Lake Formation
Amazon
DynamoDB
Amazon
Aurora
S3 Sink
Data Streams
Apps
ksqlDB
Cluster
Linking
17. Low Latency 5G Use Cases
with AWS Wavelength (based on AWS Outposts) and Confluent
18. Global Event Streaming
Streaming Replication between Clusters across Cloud, On-Prem and Edge
Bridge to Databases, Data Lakes, Apps, APIs, SaaS
Aggregate Small Footprint
Edge Deployments with
Replication (Aggregation)
Simplify Disaster Recovery
Operations with
Multi-Region Clusters
for RPO=0 and RTO~0
Stream Data Globally with
Replication and Cluster Linking
19. Omnichannel Retail
Time
P
C3 C2
C1
Sales Talk on site in
Car Dealership
Right now
Location-based
Customer Action
Customer 360
(Website, Mobile App, On Site in Store, In-Car)
Car Configurator
10 and 8 days ago
Context-specific
Marketing Campaign
90 and 60 days ago
AWS
Lambda
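The timeline on this slide (marketing campaign 90 and 60 days ago, car configurator 10 and 8 days ago, a location-based action right now) amounts to joining a real-time event against per-customer history. Below is a minimal sketch of that correlation, assuming in-memory state and invented event names and actions (`car_configurator`, `marketing_campaign`, `notify_sales_rep`, `send_offer`); in a real deployment the state would live in a stream processor and the action could be triggered through something like AWS Lambda.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class Customer360:
    """Hypothetical per-customer event history used to pick a next best action."""

    def __init__(self, lookback_days=90):
        self.lookback = timedelta(days=lookback_days)
        self.history = defaultdict(list)  # customer_id -> [(timestamp, event_type)]

    def record(self, customer_id, event_type, at):
        """Store a historical interaction (website, mobile app, in store, in car)."""
        self.history[customer_id].append((at, event_type))

    def on_location_event(self, customer_id, at):
        """Correlate a real-time location event with recent history."""
        recent = {
            event_type
            for timestamp, event_type in self.history.get(customer_id, [])
            if timedelta(0) <= at - timestamp <= self.lookback
        }
        if "car_configurator" in recent:
            return "notify_sales_rep"  # strongest buying signal: talk on site
        if "marketing_campaign" in recent:
            return "send_offer"
        return "no_action"


if __name__ == "__main__":
    now = datetime(2021, 6, 1, 12, 0)
    c360 = Customer360()
    c360.record("c1", "marketing_campaign", now - timedelta(days=60))
    c360.record("c1", "car_configurator", now - timedelta(days=8))
    print(c360.on_location_event("c1", now))
```

The lookback window and the priority of the rules are arbitrary choices for illustration; the point is only that the decision is made at the moment the location event arrives, using context accumulated earlier.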
21. CRM
3rd party
payment
provider
Context-specific
real-time upsell
Customer data
Payment processing and
fraud detection as a service
Manager
Get report
API
Customer Customer
Customer
data
Train
schedule
Payment
data
Loyalty
information
Streams of real time events
Hybrid Retail Architecture
22. Point of Sale
(POS) Loyalty
System
Local Inventory
Management
Payment Discount
Customer
data
Train
schedule
Payment
data
Loyalty
information
Streams of real time events
Global Inventory
Management
Event Streaming at the Edge in the
Smart Retail Store
Item Availability
26. Confluent Schema Registry
The de facto schema metadata repository for data in motion
Schema
Registry
Producer
Kafka
serializer
Kafka
deserializer
Consumer
Kafka
3. Produce message
with schema ID
5. Consume message
with schema ID
6. Ask for schema
given schema ID
7. Return
schema
Invalid
message
Invalid
message
4. Is this a valid
schema ID?
1. Register schema
2. Return
schema ID
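The numbered flow on this slide (register schema, get an ID back, produce with the ID, validate, consume, look the schema up by ID) can be sketched as follows. This is a simplified illustration, not the Confluent client API: `InMemorySchemaRegistry` and the helper functions are hypothetical, the payload is JSON rather than Avro, and a "schema" is reduced to a list of required field names. The framing does follow the Confluent wire format, where a message starts with a magic byte of 0 followed by the 4-byte big-endian schema ID.

```python
import json
import struct

MAGIC_BYTE = 0  # first byte of the Confluent wire format


class InMemorySchemaRegistry:
    """Hypothetical stand-in for Schema Registry, for illustration only."""

    def __init__(self):
        self._schemas = {}  # schema ID -> schema definition
        self._ids = {}      # canonical schema text -> schema ID

    def register(self, schema):
        # Steps 1-2: register the schema and return its ID.
        key = json.dumps(schema, sort_keys=True)
        if key not in self._ids:
            schema_id = len(self._ids) + 1
            self._ids[key] = schema_id
            self._schemas[schema_id] = schema
        return self._ids[key]

    def is_valid_id(self, schema_id):
        # Step 4: is this a valid schema ID?
        return schema_id in self._schemas

    def get_schema(self, schema_id):
        # Steps 6-7: return the schema for a given ID.
        return self._schemas[schema_id]


def serialize(schema_id, record):
    # Step 3: magic byte + schema ID header, then the payload (JSON here).
    header = struct.pack(">bI", MAGIC_BYTE, schema_id)
    return header + json.dumps(record).encode("utf-8")


def broker_accepts(registry, message):
    # Step 4: reject messages whose header carries an unknown schema ID.
    if len(message) < 5:
        return False
    magic, schema_id = struct.unpack(">bI", message[:5])
    return magic == MAGIC_BYTE and registry.is_valid_id(schema_id)


def deserialize(registry, message):
    # Steps 5-7: read the schema ID, fetch the schema, decode and check.
    _, schema_id = struct.unpack(">bI", message[:5])
    schema = registry.get_schema(schema_id)
    record = json.loads(message[5:].decode("utf-8"))
    missing = [field for field in schema["fields"] if field not in record]
    if missing:
        raise ValueError(f"invalid message, missing fields: {missing}")
    return record


if __name__ == "__main__":
    registry = InMemorySchemaRegistry()
    order_schema_id = registry.register({"name": "Order", "fields": ["order_id", "amount"]})
    message = serialize(order_schema_id, {"order_id": "o-1", "amount": 9.5})
    print(broker_accepts(registry, message), deserialize(registry, message))
```

Because only the schema ID travels with each message, producers and consumers stay decoupled: a consumer needs no prior knowledge of the producer's schema beyond access to the registry.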
27. Confluent Cloud Data Governance
Data Quality
Increase data trust
● Schema management UI
● Broker-side schema ID validation
Data Catalog
Classify, organize, discover
● Search and discover schema metadata
● Manage data classifications
● Classify schemas with tags
Data Lineage
Turn data visibility on
● Visualize complex data in
motion pipelines
● Audit data movement across
systems
NOW IN EARLY-ACCESS
28. Car Engine Car Self-driving Car
Confluent Completes Apache Kafka
29. Confluent Cloud + AWS: Accelerate Business Value for Customers
Topline Impacting New
Experiences
● Event-driven & real-time
● Unify data across the organization with a Kafka
data fabric (Schema Registry, ...)
● AWS Analytics, Redshift, ML
connectors
Mitigate Risk
● Higher Service Quality &
Resilience with 99.95% SLA
● Deep Kafka expertise & innovation
● Elastic billing/pricing
Developer Agility
● Focus on innovation (not data
infrastructure)
● Leverage full Kafka OSS
ecosystem + AWS services
Faster Time to Market
● ~50-75% faster time to market*
● Streamline hybrid cloud migration with no complex lift-n-shift
● Maintain business continuity
Lower Kafka TCO
● ~25-50% lower TCO *
● GBps-scale & fast deployments
for global expansion
● Deploy Kafka at scale in 1 week
Maximize ROI
● ~200% ROI per Forrester study
● Save 10s of $Ms with legacy
offload to AWS with Confluent
Replicator
* For customers that don't already have a Kafka-based system in-market
* TCO assessment to be analyzed for specific customer scenarios