©2022 Databricks Inc. — All rights reserved
Kafka with Spark Structured Streaming and Beyond: Building Real-Time Data Processing and Analytics with Databricks
Emma Liu, Staff Product Manager, Databricks
Nitin Saksena, Sr. Director of Omni Channel Architecture, Albertsons Companies
Ram Dhakne, Staff Solutions Engineer, Confluent
Databricks
The Lakehouse Company
Global adoption
Over 7,000 customers, from F500 to unicorns
Inventor and pioneer of the data lakehouse
Gartner-recognized leader in both
● Database Management Systems
● Data Science and Machine Learning Platforms
Creator of highly successful OSS data projects: Delta Lake, Apache Spark, and MLflow
Raised over $3B in investment
3,000+ employees across the globe
Databricks Lakehouse Platform
Simple: Unify your data warehousing and AI use cases on a single platform
Multicloud: One consistent data platform across clouds
Open: Built on open source and open standards
Lakehouse Platform: Data Warehousing, Data Engineering, Data Science and ML, Data Streaming
Unity Catalog: Fine-grained governance for data and AI
Delta Lake: Data reliability and performance
Cloud Data Lake: All structured and unstructured data
Databricks thrives within your modern data stack
Data Governance, Data Warehousing, Data Engineering, Data Science and ML, Data Streaming
BI and Dashboards, Machine Learning, Data Science
Consulting & SI Partners
Data Pipelines, Data Ingestion
Unity Catalog
Delta Lake
Cloud Data Lake
Speaker: Emma Liu
Benyue (Emma) Liu
Staff Product Manager – Data Ingestion @ Databricks
Previous Experiences: Product Manager at TigerGraph, MarkLogic; Software Engineer at Oracle
Domain Focus: Cloud Computing, DBaaS, Multi-Model & Graph Databases, Developer/DBA Tools, Containerization
MS in Technology and Policy Program at MIT
BS in General Engineering at Harvey Mudd College
Spark Structured Streaming
Project Lightspeed
Topic
All structured and unstructured data
Cloud Data Lake
Unified Governance and Security
Data Reliability and Performance
Lakehouse Platform: Data Warehousing, Data Engineering, Data Science and ML, Data Streaming
Developer Experience
Kafka and Spark Structured Streaming
Streaming Data
Continuously generated and unbounded data
Machine & Application Logs, Clickstreams, Mobile & IoT Data, DB Change Data Feeds, Application Events
The vast majority of the data in the world is streaming data!
Stream Processing
Traditional processing is one-off and bounded: (1) Data Source, (2) Processing
Stream processing is continuous and unbounded: Data Source, Processing
Stream Processing in the Lakehouse
Data Sources: Databases, Streaming Sources, Cloud Object Stores, SaaS Applications, NoSQL, On-premises Systems
Ingest: Auto Loader
Transform and improve quality: Stream/batch processing with Spark, Structured Streaming (Photon)
Serve / Share
Data Consumers: BI, Reporting, Dashboarding, Data Science, Machine Learning, Operational Apps
Orchestrate all processes, Monitor & Govern
Delta Live Tables, Jobs
Technical Advantages
A more intuitive way of capturing and processing continuous and unbounded data
Lower latency for time-sensitive applications and use cases
Better fault tolerance through checkpointing
Higher compute utilization and scalability through continuous and incremental processing
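The checkpointing and incremental-processing points can be illustrated with a toy sketch in plain Python. This is not how Structured Streaming implements checkpoints; the function name, the JSON checkpoint file, and the integer events are all assumptions made for illustration. The idea shown is only that each run resumes from the last committed offset instead of reprocessing everything.

```python
import json
import os
import tempfile


def process_increment(events, checkpoint_path):
    """Process only events past the last checkpointed offset, then commit."""
    # Load the last committed state, or start fresh on the first run.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"offset": 0, "total": 0}

    # Incremental step: touch only the events that arrived since the checkpoint.
    new_events = events[state["offset"]:]
    state["total"] += sum(new_events)
    state["offset"] = len(events)

    # Write to a temp file and rename, so a crash never leaves a half-written checkpoint.
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, checkpoint_path)
    return state


checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
stream = [3, 5]
first = process_increment(stream, checkpoint)
stream = stream + [7]  # more data arrives
second = process_increment(stream, checkpoint)  # a "restart": only 7 is processed
```

After a restart the second call reads the saved offset and adds only the new event to the running total, which is the property that makes recovery cheap.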
Business Benefits
BI and SQL Analytics: Fresher and faster insights; quicker and better business decisions
Data Engineering: Sooner availability of cleaned data; more business use cases
Data Science and ML: More frequent model update and inference; better model efficacy
Event Driven Application: Faster customized response and action; better and differentiated customer experience
Project Lightspeed
Faster and simpler stream processing
Predictable Low Latency: Target reduction in tail latency by up to 2x
Enhanced Functionality: Advanced capabilities for processing data with new operators and easy-to-use APIs
Operations & Troubleshooting: Simplifying deployment, operations, monitoring, and troubleshooting
Connectors & Ecosystem: Improving ecosystem support for connectors, authentication & authorization features
Apache Kafka & Apache Spark Structured Streaming
Apache Kafka and Apache Spark Structured Streaming
Apache Kafka Connector for Spark Structured Streaming, Structured Streaming
End-to-end Open Source Pipeline
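As a rough sketch of what reading from Kafka with Structured Streaming can look like in PySpark: the option keys below (`kafka.bootstrap.servers`, `subscribe`, `startingOffsets`) are the ones documented for Spark's built-in `kafka` source, while the helper names, broker address, and topic are hypothetical. `read_kafka_stream` is not called here because it needs a live SparkSession with the Kafka connector on the classpath.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the option map for Spark's built-in "kafka" streaming source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def read_kafka_stream(spark, options):
    """Return a streaming DataFrame; requires a SparkSession with the Kafka connector."""
    raw = spark.readStream.format("kafka").options(**options).load()
    # Kafka keys and values arrive as binary, so cast them before parsing.
    return raw.selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")


# Hypothetical broker and topic, for illustration only.
opts = kafka_source_options("broker-1:9092", "offers")
```

A managed Kafka service would typically also need security options (for example SASL settings) added to the same map; those are left out of this sketch.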
Apache Kafka Connector for Spark Structured Streaming
Apache Kafka Connector for Spark Structured Streaming, Structured Streaming
Apache Kafka and Databricks Auto Loader
Auto Loader, Cloud Storage (S3, ADLS, GCS), Structured Streaming
Confluent Cloud - Databricks Delta Lake Sink Connector
Cloud Storage, Delta Lake Sink Connector
Latency, Convenience, Cost
Choose the right latency, convenience, and cost tradeoff for each specific use case
Supercharge Lakehouse with Streaming Data
Databricks and Confluent
Albertsons
Building Real-Time Data Processing and Real-Time Analytics
Nitin Saksena
Senior Director of Omni Channel Architecture, Albertsons Companies
ALBERTSONS COMPANIES
Albertsons Companies is one of the largest food and drug retailers in the country.
Locally great, Nationally strong
Our Purpose
To bring people together around the joys of food and to inspire well-being
Albertsons – Locally great, Nationally strong
Environment, Social and Governance (ESG) efforts are focused on: Planet, People, Product and Community
Nitin Saksena
Senior Director of Omni Channel Architecture, Albertsons
Nitin leads the Enterprise Architecture Team across eCommerce, Digital Shopping Experience, Marketing and Media Collective, Merchandising and Pharmacy. He has 19+ years of industry experience, predominantly in retail. Nitin has architected several key initiatives in Loyalty, Offer Execution and Redemption, Order Management and Customer Support. His area of interest is solving complex business problems with simple solutions.
Technology Transformation and Strategic Priorities
Technology Transformation - Key Tenets
High Performance Team: Strengthen talent with an agile and high-energy Global Team
Omni-Channel Experiences: Build easy, frictionless & differentiated Omni-Channel Experiences
Modern Technology Platform: Create agile, scalable and reliable technology with Platform Modernization
Intelligent Data: Maximize value for our customers and employees with intelligent data
Productivity: Leverage technology to drive productivity in the enterprise
Strong Technology Foundation to Support Accelerated Business Growth
Technology Strategic Priorities
DRIVERS: Intelligent, Agility, Cognitive, Realtime, Connected, Autonomous
Foundation: Modernization & Migration to Public Cloud; Enterprise Data Platform; Network Modernization; InfoSecurity; Global People & Process Optimization
Business Priorities: Competitive Digital & eCommerce; Optimized Promotions; Forecasting & Replenishment; Run Great Stores; Smart Stores; Supply Chain of the Future; Scaled & Differentiated eCommerce; Strategic Data; Health & Wellness Expansion; Loyalty Programs
EASY & FRICTIONLESS, EXCITING & INNOVATIVE, DIGITAL PENETRATION, LONG TERM CUSTOMER RELATIONSHIPS
Architecture Deep Dive
Real Time Data Processing using Databricks
Distributing Offers and Customer Clips to each store in Near Real Time
• Distribution of Offers to the relevant Stores
• Distribution of Clips in near real time to the frequently shopped store
• An improved Store checkout experience
Real Time Business Reporting and Dashboard
• Supply Chain Order Forecast
• Warehouse Order Management and Delivery
• Labor Forecasting and Management
• Demand planning
• Inventory management
Inventory updates from 2200 stores to the cloud services in near real time
• Maintain visibility into the health of the Stores
• Well Managed Out of Stock and Substitution recommendations for Digital Customers
Ingesting Transactions in Near Real Time to Data Hub
• Generate Near Real Time Dashboards for Associates and Business
• Feed Data in Near Real Time to Data Models for in-session Hyper Personalization
Real Time Data Processing Using Databricks
Real Time Data Processing using Databricks in an Event Driven Architecture
Problem Statement
• The Merchandising Team creates Offers and Promotions.
• Offers are created at Banner, Division or Individual Store levels.
• These Offers are to be presented to our customers.
• More importantly, these Offers are sent to the stores across the country.
• Customers view these Offers on digital channels and clip them.
• These Clips are sent to frequently shopped stores in real time.
• Clipped Offers are applied to transactions online and in stores.
• Stores and Online systems are notified of the redeemed Offers.
Offers, Clips, Stores
Real Time Data Processing using Databricks
Offer Management System
API API
Master Data
Stores
Offer Expansion and Distribution
Clips Distribution to relevant store
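The two steps named on this slide, offer expansion and clip distribution, might be sketched as below. This is a simplified illustration in plain Python and not Albertsons' implementation; the `Store` fields, the offer and clip dictionaries, and the `frequent` lookup are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    store_id: str
    division: str
    banner: str


def expand_offer(offer, stores):
    """Return ids of the stores covered by an offer scoped to a banner, division, or store."""
    level, value = offer["level"], offer["value"]
    if level == "store":
        return [s.store_id for s in stores if s.store_id == value]
    if level == "division":
        return [s.store_id for s in stores if s.division == value]
    if level == "banner":
        return [s.store_id for s in stores if s.banner == value]
    raise ValueError(f"unknown offer level: {level}")


def route_clip(clip, frequent_stores):
    """Return the stores a clip should be sent to: the customer's frequently shopped stores."""
    return frequent_stores.get(clip["customer_id"], [])


stores = [
    Store("s1", "norcal", "banner-a"),
    Store("s2", "norcal", "banner-b"),
    Store("s3", "socal", "banner-a"),
]
division_offer = {"offer_id": "o1", "level": "division", "value": "norcal"}
banner_offer = {"offer_id": "o2", "level": "banner", "value": "banner-a"}
clip = {"customer_id": "c9", "offer_id": "o1"}
frequent = {"c9": ["s2"]}  # customer id -> frequently shopped store ids

division_targets = expand_offer(division_offer, stores)
banner_targets = expand_offer(banner_offer, stores)
clip_targets = route_clip(clip, frequent)
```

In a streaming pipeline the same two functions would run per event, with store master data and shopping history joined in as lookup tables.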
Real Time Analytics Using Databricks
API, Operational Database, Event Database
Extract Metadata out of Business Events
Cross Reference the Metadata with other Event Metadata
Derive the Metrics out of Event Cross References
Derived Metrics, Dashboard Application
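The extract, cross-reference, and derive steps can be sketched with a small example. The redemption-rate metric, the event shape, and the function name are assumptions chosen for illustration; the slides do not say which metrics are derived.

```python
from collections import defaultdict


def derive_redemption_rate(events):
    """Cross-reference clip and redeem events by offer id and derive a per-offer rate."""
    clipped = defaultdict(set)
    redeemed = defaultdict(set)
    for event in events:
        # Step 1: extract only the metadata needed from each business event.
        offer_id, customer_id = event["offer_id"], event["customer_id"]
        if event["type"] == "clip":
            clipped[offer_id].add(customer_id)
        elif event["type"] == "redeem":
            redeemed[offer_id].add(customer_id)

    # Steps 2 and 3: cross-reference the two event types, then derive the metric.
    return {
        offer_id: len(redeemed[offer_id] & customers) / len(customers)
        for offer_id, customers in clipped.items()
    }


events = [
    {"type": "clip", "offer_id": "o1", "customer_id": "c1"},
    {"type": "clip", "offer_id": "o1", "customer_id": "c2"},
    {"type": "redeem", "offer_id": "o1", "customer_id": "c1"},
    {"type": "clip", "offer_id": "o2", "customer_id": "c3"},
]
rates = derive_redemption_rate(events)
```

The resulting dictionary is the kind of derived metric a dashboard application could read.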
Future Use Case with Real Time Data Processing
Hyper Personalization of Content (Offers, Recipes)
In-session recommendation using runtime data models
Next Best Action
Reduced Out of Stock Pick and Improve Substitution by using real-time inventory signals
Databricks & Confluent Partnership
Speaker: Ram Dhakne
Ram works as a Staff Solutions Engineer at Confluent. He has a wide array of experience in NoSQL databases, filesystems, distributed systems and Apache Kafka.
He supports customers across industry verticals, including large financial services, retail, healthcare, telecom, and utilities companies, on their digital modernization journeys.
His current interests are in helping customers adopt real-time event streaming technologies using Kafka. As a part-time hobby, he has authored two children’s books.
Modern data platforms are powering multiple real-time apps across industries
Retail: Drive consumer analytics & streamline operations
• Inventory Management
• Personalized Promotions
• Product Development & Introduction
• Sentiment Analysis
• Supply chain logistics
• Systems of Scale for High Traffic Periods
Healthcare: Provide patients better choices & doctors better insight
• Connected Health Records
• Data Confidentiality & Accessibility
• Dynamic Staff Allocation Optimization
• Integrated Treatment
• Proactive Patient Care
• Real-Time Monitoring
Banking: Combat fraud & remain competitive
• Capital Management
• Early-On Fraud Detection
• Market Risk Recognition & Investigation
• Preventive Regulatory Scanning
• Real-Time What-If Analysis
• Trade Flow Monitoring
Automotive: Amplify vehicle intelligence & safety
• Advanced Navigation
• Environmental Factor Processing
• Fleet Management
• Predictive Maintenance
• Threat Detection & Real-Time Response
• Traffic Distribution Optimization
Multiple deployment models
Confluent Platform: The Enterprise Distribution of Apache Kafka
• Self-Managed Software
• VM
• Deploy on any platform, on-prem or cloud
Confluent Cloud: Apache Kafka Re-engineered for the Cloud
• Fully-Managed Service
• Available on the leading public clouds
From migration to modernization
How Confluent and Databricks accelerate your journey to real-time analytics in the cloud
Modernize your data platform with Confluent + Databricks
Reduce total cost of ownership: Process data in stream to lower DW costs. Lower data pipeline TCO with fully managed cloud service.
Power new analytics and apps: Link on-prem and cloud for easier data movement across environments with real-time event streaming to modernize your analytics and applications.
Get more data to and from your DW: Break data silos and easily connect your data warehouse to popular sources and sinks using Confluent’s 120+ pre-built connectors.
= Real-time connections & streams
Best in class solution for real-time analytics
Delivered at scale with the speed, security, and reliability required by enterprises
Confluent
● Bring Data from everywhere to everywhere
● Multi-Cloud Native
● Open Source Kafka - hardened for Enterprise
Databricks
● Single platform for BI, AI, ML on all data
● Multi-Cloud Native
● Open source (Spark, MLflow, Delta Lake)
Multiple Integrations
Managed Connectors (for near real time data delivery)
Spark Structured Streaming (for real time data delivery)
Next steps
Databricks
Get started: Try Databricks for free at databricks.com/try-databricks
Visit the Databricks booth
Confluent
Get started: Try Confluent for free with $400 in free credits at confluent.io/confluent-cloud
Visit the Confluent booth
Q&A
Thank you!
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ram Dhakne | Current 2022

  • 1. ©2022 Databricks Inc. — All rights reserved Kafka with Spark Structured Streaming and Beyond: Building Real-Time Data Processing and Analytics with Databricks Emma Liu Staff Product Manager, Databricks Nitin Saksena Sr. Director of Omni Channel Architecture, Albertsons Companies Ram Dhakne Staff Solutions Engineer, Confluent
  • 2. Databricks: The Lakehouse Company. Global adoption: over 7000 customers, from F500 to unicorns. Inventor and pioneer of the data lakehouse. Gartner-recognized leader in both Database Management Systems and Data Science and Machine Learning Platforms. Creator of highly successful OSS data projects: Delta Lake, Apache Spark, and MLflow. Raised over $3B in investment. 3000+ employees across the globe.
  • 3. Databricks Lakehouse Platform. Simple: unify your data warehousing and AI use cases on a single platform. Multicloud: one consistent data platform across clouds. Open: built on open source and open standards. Platform layers: Data Warehousing, Data Engineering, Data Science and ML, and Data Streaming, on top of Unity Catalog (fine-grained governance for data and AI), Delta Lake (data reliability and performance), and a Cloud Data Lake (all structured and unstructured data).
  • 4. Databricks thrives within your modern data stack: Data Governance, Data Warehousing, Data Engineering, Data Science and ML, Data Streaming, BI and Dashboards, Machine Learning, Data Science, Consulting & SI Partners, Data Pipelines, Data Ingestion; Unity Catalog, Delta Lake, Cloud Data Lake.
  • 5. Speaker: Benyue (Emma) Liu, Staff Product Manager, Data Ingestion @ Databricks. Previous experience: Product Manager at TigerGraph and MarkLogic; Software Engineer at Oracle. Domain focus: Cloud Computing, DBaaS, Multi-Model & Graph Databases, Developer/DBA Tools, Containerization. MS in Technology and Policy Program at MIT. BS in General Engineering at Harvey Mudd College.
  • 6. Topics: Spark Structured Streaming; Project Lightspeed; Kafka and Spark Structured Streaming. Lakehouse Platform: Data Warehousing, Data Engineering, Data Science and ML, Data Streaming; Developer Experience; Unified Governance and Security; Data Reliability and Performance; Cloud Data Lake (all structured and unstructured data).
  • 7. Streaming Data: continuously generated and unbounded data. Sources: Machine & Application Logs, Clickstreams, Mobile & IoT Data, DB Change Data Feeds, Application Events. The vast majority of the data in the world is streaming data!
  • 8. Stream Processing: traditional processing is one-off and bounded (1. Data Source, 2. Processing); stream processing is continuous and unbounded (Data Source, Processing).
  • 9. Stream Processing in the Lakehouse. Data Sources: Databases, Streaming Sources, Cloud Object Stores, SaaS Applications, NoSQL, On-premises Systems. Ingest: Auto Loader. Transform and improve quality: stream/batch processing with Spark, Structured Streaming (Photon). Serve / Share. Data Consumers: BI Reporting, Dashboarding, Data Science, Machine Learning, Operational Apps. Orchestrate all processes; Monitor & Govern: Delta Live Tables, Jobs.
  • 10. Technical Advantages: a more intuitive way of capturing and processing continuous and unbounded data; lower latency for time-sensitive applications and use cases; better fault tolerance through checkpointing; higher compute utilization and scalability through continuous and incremental processing.
  • 11. Business Benefits. BI and SQL Analytics: fresher and faster insights; quicker and better business decisions. Data Engineering: sooner availability of cleaned data; more business use cases. Data Science and ML: more frequent model update and inference; better model efficacy. Event-Driven Application: faster customized response and action; better and differentiated customer experience.
  • 12. Project Lightspeed
  • 13. Project Lightspeed: faster and simpler stream processing. Predictable Low Latency: target reduction in tail latency by up to 2x. Enhanced Functionality: advanced capabilities for processing data with new operators and easy-to-use APIs. Operations & Troubleshooting: simplifying deployment, operations, monitoring, and troubleshooting. Connectors & Ecosystem: improving ecosystem support for connectors, authentication & authorization features.
  • 14. Apache Kafka & Apache Spark Structured Streaming
  • 15. Apache Kafka and Apache Spark Structured Streaming: Apache Kafka Connector for Spark Structured Streaming; Structured Streaming; end-to-end open source pipeline.
  • 16. Apache Kafka Connector for Spark Structured Streaming. Diagram labels: Apache Kafka Connector for Spark Structured Streaming, Structured Streaming.
  • 17. Apache Kafka and Databricks Auto Loader. Diagram labels: Auto Loader, Cloud Storage (S3, ADLS, GCS), Structured Streaming.
  • 18. Confluent Cloud - Databricks Delta Lake Sink Connector. Diagram labels: Cloud Storage, Delta Lake Sink Connector.
  • 19. Latency, Convenience, Cost: choose the right latency, convenience, and cost tradeoff for each specific use case.
  • 20. Supercharge Lakehouse with Streaming Data: Databricks and Confluent
  • 21. Albertsons
  • 22. Building Real-Time Data Processing and Real-Time Analytics. Nitin Saksena, Senior Director of Omni Channel Architecture, Albertsons Companies
  • 23. ALBERTSONS COMPANIES. Albertsons Companies is one of the largest food and drug retailers in the country. Locally great, Nationally strong.
  • 24. Our Purpose: to bring people together around the joys of food and to inspire well-being.
  • 25. Albertsons: Locally great, Nationally strong. Environment, Social and Governance (ESG) efforts are focused on Planet, People, Product and Community.
  • 26. Nitin Saksena, Senior Director of Omni Channel Architecture, Albertsons. Nitin leads the Enterprise Architecture Team across eCommerce, Digital Shopping Experience, Marketing and Media Collective, Merchandising and Pharmacy. He has 19+ years of industry experience, predominantly in retail. Nitin has architected several key initiatives in Loyalty, Offer Execution and Redemption, Order Management and Customer Support. His main interest is solving complex business problems with simple solutions.
  • 28. Technology Transformation - Key Tenets: Strong Technology Foundation to Support Accelerated Business Growth. High Performance Team: strengthen talent with an agile and high-energy global team. Omni-Channel Experiences: build easy, frictionless & differentiated omni-channel experiences. Modern Technology Platform: create agile, scalable and reliable technology with platform modernization. Intelligent Data: maximize value for our customers and employees with intelligent data. Productivity: leverage technology to drive productivity in the enterprise.
  • 29. Technology Strategic Priorities DRIVERS Intelligent Agility Cognitive Realtime Connected Autonomous Foundation Modernization & Migration to Public Cloud Enterprise Data Platform Network Modernization InfoSecurity Global People & Process Optimization Competitive Digital & eCommerce Optimized Promotions Forecasting & Replenishment Run Great Stores Smart Stores Supply Chain of the Future Scaled & Differentiated eCommerce Strategic Data Health & Wellness Expansion Loyalty Programs Business Priorities EASY & FRICTIONLESS EXCITING & INNOVATIVE DIGITAL PENETRATION LONG TERM CUSTOMER RELATIONSHIPS
  • 31. Real Time Data Processing using Databricks. Distributing Offers and Customer Clips to each store in near real time: distribution of Offers to the relevant stores; distribution of Clips in near real time to the frequently shopped store; an improved store checkout experience. Real Time Business Reporting and Dashboard: Supply Chain Order Forecast; Warehouse Order Management and Delivery; Labor Forecasting and Management; Demand Planning; Inventory Management. Inventory updates from 2200 stores to the cloud services in near real time: maintain visibility into the health of the stores; well-managed Out of Stock and Substitution recommendations for digital customers. Ingesting Transactions in near real time to the Data Hub: generate near-real-time dashboards for associates and the business; feed data in near real time to data models for in-session Hyper Personalization.
  • 32. Real Time Data Processing using Databricks in an Event Driven Architecture. Problem Statement: the Merchandising Team creates Offers and Promotions; the Offers are created at Banner, Division or Individual Store levels; these Offers are to be presented to our customers; more importantly, these Offers are sent to the stores across the country; customers view these Offers on digital channels and clip them; these Clips are sent to frequently shopped stores in real time; clipped Offers are applied to transactions online and in stores; stores and online systems are notified of the redeemed Offers. Diagram labels: Stores, Offers, Clips.
  • 33. Real Time Data Processing using Databricks. Diagram labels: Offer Management System, API, Master Data, Stores, Offer Expansion and Distribution, Clips Distribution to relevant store.
  • 34. Real Time Analytics Using Databricks. Flow: extract metadata out of business events; cross-reference the metadata with other event metadata; derive the metrics out of event cross-references. Diagram labels: API, Operational Database, Event Database, Derived Metrics, Dashboard Application.
  • 35. Future Use Case with Real Time Data Processing. Hyper Personalization of Content (Offers, Recipes): in-session recommendation using runtime data models; Next Best Action. Reduced Out of Stock Pick and Improved Substitution: by using real-time inventory signals.
  • 36. Databricks & Confluent Partnership
  • 37. Speaker: Ram Dhakne. Ram works as a Staff Solutions Engineer at Confluent. He has a wide array of experience in NoSQL databases, filesystems, distributed systems and Apache Kafka. He supports companies across industry verticals, including large financial services, retail, healthcare, telecom, and utilities, on their digital modernization journey. His current interest is helping customers adopt real-time event streaming technologies using Kafka. As a part-time hobby, he has authored two children’s books.
  • 38. Modern data platforms are powering multiple real-time apps across industries. Retail (drive consumer analytics & streamline operations): Inventory Management, Personalized Promotions, Product Development & Introduction, Sentiment Analysis, Supply Chain Logistics, Systems of Scale for High Traffic Periods. Healthcare (provide patients better choices & doctors better insight): Connected Health Records, Data Confidentiality & Accessibility, Dynamic Staff Allocation Optimization, Integrated Treatment, Proactive Patient Care, Real-Time Monitoring. Banking (combat fraud & remain competitive): Capital Management, Early-On Fraud Detection, Market Risk Recognition & Investigation, Preventive Regulatory Scanning, Real-Time What-If Analysis, Trade Flow Monitoring. Automotive (amplify vehicle intelligence & safety): Advanced Navigation, Environmental Factor Processing, Fleet Management, Predictive Maintenance, Threat Detection & Real-Time Response, Traffic Distribution Optimization.
  • 40. Multiple deployment models. Confluent Platform, the Enterprise Distribution of Apache Kafka: self-managed software; deploy on any platform, on-prem or cloud (VM). Confluent Cloud, Apache Kafka Re-engineered for the Cloud: fully managed service; available on the leading public clouds.
  • 41. From migration to modernization: how Confluent and Databricks accelerate your journey to real-time analytics in the cloud
  • 42. Modernize your data platform with Confluent + Databricks. Reduce total cost of ownership: process data in stream to lower DW costs; lower data pipeline TCO with fully managed cloud service. Power new analytics and apps: link on-prem and cloud for easier data movement across environments with real-time event streaming to modernize your analytics and applications. Get more data to and from your DW: break data silos and easily connect your data warehouse to popular sources and sinks using Confluent’s 120+ pre-built connectors. Legend: real-time connections & streams.
  • 43. Best-in-class solution for real-time analytics, delivered at scale with the speed, security, and reliability required by enterprises. ● Bring data from everywhere to everywhere ● Multi-Cloud Native ● Open Source Kafka, hardened for Enterprise ● Single platform for BI, AI, ML on all data ● Multi-Cloud Native ● Open source (Spark, MLflow, Delta Lake). Multiple integrations: Managed Connectors (for near real time data delivery); Spark Structured Streaming (for real time data delivery).
  • 44. Next steps. Databricks: get started and try Databricks for free at databricks.com/try-databricks; visit the Databricks booth. Confluent: get started and try Confluent for free with $400 in free credits at confluent.io/confluent-cloud; visit the Confluent booth.
  • 45. Q&A
  • 47. Thank you!
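
As a rough illustration of the pipeline named on slides 15 and 16 (Apache Kafka feeding Spark Structured Streaming through the Kafka connector), the PySpark sketch below reads a topic and appends it to a Delta table, with a checkpoint location for the fault tolerance mentioned on slide 10. It is not taken from the talk: the helper names `kafka_source_options` and `start_kafka_to_delta` are invented for this example, and any broker address, topic name, or path passed to them is a placeholder. It assumes an existing SparkSession with the Kafka connector and Delta Lake available, as would typically be the case on a Databricks cluster.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str,
                         starting_offsets: str = "latest") -> Dict[str, str]:
    """Build options for Spark's built-in "kafka" streaming source.

    The helper itself is hypothetical; the option keys are the ones the
    Spark Kafka source reads.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def start_kafka_to_delta(spark, source_options: Dict[str, str],
                         checkpoint_path: str, target_path: str):
    """Start a streaming query that appends Kafka records to a Delta table.

    `spark` is assumed to be an existing SparkSession with the Kafka
    connector and Delta Lake on its classpath.
    """
    events = (
        spark.readStream
        .format("kafka")
        .options(**source_options)
        .load()
        # Kafka keys and values arrive as binary; cast them to strings.
        .selectExpr(
            "CAST(key AS STRING) AS key",
            "CAST(value AS STRING) AS value",
            "topic", "partition", "offset", "timestamp",
        )
    )
    return (
        events.writeStream
        .format("delta")
        # The checkpoint records processed offsets so a restarted query
        # resumes where it stopped.
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        .start(target_path)
    )
```

On a cluster this might be invoked as `start_kafka_to_delta(spark, kafka_source_options("broker-1:9092", "offers"), "/tmp/checkpoints/offers", "/tmp/delta/offers")`, where every argument is a placeholder. Keeping the option building separate from the query makes the configuration easy to inspect without a running cluster.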