The Top 5 Apache Kafka Use Cases and Architectures in 2022

I see the following topics coming up more regularly in conversations with customers, prospects, and the broader Kafka community across the globe:

Kappa Architecture: Kappa goes mainstream to replace Lambda and Batch pipelines (that does not mean that there is no batch processing anymore). Examples: Kafka-powered Kappa architectures from Uber, Disney, Shopify, and Twitter.
Hyper-personalized Omnichannel: Retail and customer communication across online and offline channels becomes the new black, including context-specific upselling, recommendations, and location-based services. Examples: Omnichannel Retail and Customer 360 in Real-Time with Apache Kafka.
Multi-Cloud Deployments: Business units and IT infrastructures span across regions, continents, and cloud providers. Linking clusters for bi-directional replication of data in real-time becomes crucial for many business models. Examples: Global Kafka deployments.
Edge Analytics: Low-latency, cost-efficiency, or security requirements force the deployment of (some) event streaming use cases at the far edge (i.e., outside a data center), for instance, for predictive maintenance and quality assurance at the shop-floor level in smart factories. Examples: Edge analytics with Kafka.
Real-time Cybersecurity: Situational awareness and threat intelligence need to process massive data in real-time to defend against cyberattacks successfully. The many successful ransomware attacks across the globe in 2021 were a warning for most CIOs. Examples: Cybersecurity for situational awareness and threat intelligence in real-time.
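The Kappa item above rests on one idea: a single append-only log serves both real-time and batch consumers, and both run the same processing code. The following is a minimal in-memory sketch of that idea, not a Kafka client. `EventLog`, `Consumer`, and `revenue_cents` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventLog:
    """In-memory stand-in for one append-only topic partition (hypothetical, not a Kafka client)."""
    records: List[dict] = field(default_factory=list)

    def append(self, record: dict) -> int:
        """Append a record and return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def read_from(self, offset: int) -> List[dict]:
        """Return every record at or after the given offset."""
        return self.records[offset:]


def revenue_cents(records: List[dict]) -> int:
    """The one processing function shared by all consumers."""
    return sum(r["amount_cents"] for r in records)


class Consumer:
    """Tracks its own offset, so it can tail the log or replay it from the start."""

    def __init__(self, log: EventLog, start_offset: int = 0) -> None:
        self.log = log
        self.offset = start_offset
        self.total = 0

    def poll(self) -> int:
        """Process all records not yet seen and return the running total."""
        new_records = self.log.read_from(self.offset)
        self.offset += len(new_records)
        self.total += revenue_cents(new_records)
        return self.total


log = EventLog()
realtime = Consumer(log)

# The real-time consumer sees each order right after it is appended.
running_totals = []
for amount in (1000, 2500, 400):
    log.append({"amount_cents": amount})
    running_totals.append(realtime.poll())

# A "batch" consumer attached later replays the same log from offset 0.
batch = Consumer(log, start_offset=0)
batch_total = batch.poll()
```

Because both consumers read the same log through the same function, the late-starting batch consumer arrives at the same total as the real-time one. In a Lambda setup, two separate code paths would have to be kept in agreement instead.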

The Top 5 Apache Kafka Use Cases and Architectures in 2022

  1. 1. The Top 5 Use Cases and Architectures for Data in Motion in 2022 Kappa Architecture, Omnichannel, Multi-Cloud, Edge Analytics, and Real-time Cybersecurity Kai Waehner Field CTO kai.waehner@confluent.io linkedin.com/in/kaiwaehner @KaiWaehner confluent.io kai-waehner.de
  2. 2. @KaiWaehner www.kai-waehner.de https://www.gartner.com/en/information-technology/insights/top-technology-trends
  3. 3. @KaiWaehner www.kai-waehner.de Cloud Machine Learning Mobile Data in Motion Rethink Decision Making Rethink User Experience Rethink Data Rethink Data Centers
  4. 4. @KaiWaehner www.kai-waehner.de Real-time Data in Motion beats Slow Data. Transportation Real-time sensor diagnostics Driver-rider match ETA updates Banking Fraud detection Trading, risk systems Mobile applications / customer experience Retail Real-time inventory Real-time POS reporting Personalization Entertainment Real-time recommendations Personalized news feed In-app purchases
  5. 5. @KaiWaehner www.kai-waehner.de This is a fundamental paradigm shift... Future of the datacenter: Cloud (Infrastructure as code). Future of data: Event Streaming (Data in motion as continuous streams of events).
  6. 6. @KaiWaehner www.kai-waehner.de Apache Kafka is the Platform for Data in Motion MES ERP Sensors Mobile Customer 360 Real-time Alerting System Data warehouse Producers Consumers Streams and storage of real time events Stream processing apps Connectors Connectors Stream processing apps Supplier Alert Forecast Inventory Customer Order 6
  7. 7. @KaiWaehner www.kai-waehner.de The Top 5 Use Cases and Architectures for Data in Motion in 2022 1) The Kappa Architecture 2) Hyper-personalized Omnichannel 3) Multi-Cloud Deployments 4) Edge Analytics 5) Real-time Cybersecurity
  8. 8. @KaiWaehner www.kai-waehner.de The Top 5 Use Cases and Architectures for Data in Motion in 2022 1) The Kappa Architecture 2) Hyper-personalized Omnichannel 3) Multi-Cloud Deployments 4) Edge Analytics 5) Real-time Cybersecurity
  9. 9. @KaiWaehner www.kai-waehner.de Lambda Architecture Option 1: Unified serving layer 9 Data Source Real-Time Layer (Data Processing in Motion) Batch Layer (Data Processing at Rest) Serving Layer Real-Time App (Data Processing in Motion) Batch App (Data Processing at Rest) ms min/hr
  10. 10. @KaiWaehner www.kai-waehner.de 10 Data Source Real-Time Layer (Data Processing in Motion) Batch Layer (Data Processing at Rest) Real-time Query Mixed Query ms min/hr Speed View Batch View Batch Query Lambda Architecture Option 2: Separate serving layers
  11. 11. @KaiWaehner www.kai-waehner.de Concerns with the Lambda Architecture
  12. 12. @KaiWaehner www.kai-waehner.de 12 Data Source Real-Time Layer (Data Processing in Motion) Real-Time App (Data Processing in Motion) Storage Batch App (Data Processing at Rest) Storage ms min/hr Storage Kappa Architecture One pipeline for real-time and batch consumers
  13. 13. @KaiWaehner www.kai-waehner.de Tiered Storage for Kafka
  14. 14. @KaiWaehner www.kai-waehner.de Kappa @ Uber
  15. 15. @KaiWaehner www.kai-waehner.de Kappa @ Shopify. Kappa Building Blocks: (1) The Log (Kafka): Durability with Topic Compaction and Tiered Storage; Consistency via Exactly-Once Semantics (EOS); Data Integration via Kafka Connect; Elasticity via dynamic Kafka clusters. (2) Streaming Framework (Kafka Streams / Flink): Reliability and scalability; Fault tolerance; State management. (3) Sinks: Update/Upsert for simplified design (RDBMS, NoSQL, Compacted Kafka Topics); Append-only (Regular Kafka Topics, Time Series).
  16. 16. @KaiWaehner www.kai-waehner.de Kappa @ Disney 16 www.kai-waehner.de | @KaiWaehner | Streaming Machine Learning without a Data Lake
  17. 17. @KaiWaehner www.kai-waehner.de Benefits of the Kappa Architecture The Kappa architecture leverages a single source of truth with a focus on simplicity in the enterprise architecture • Improve streaming to handle all the cases • One codebase that is always in sync • One set of infrastructure and technology • The heart of the infrastructure is real-time, scalable, and reliable • Improved data quality with guaranteed ordering and no mismatches • No need to re-architect for new use cases, just connect new consumers (real-time, near real-time, batch, RPC) • Kappa is NOT a free lunch – know the trade-offs and best practices
  18. 18. @KaiWaehner www.kai-waehner.de Kappa is NOT a free lunch
  19. 19. @KaiWaehner www.kai-waehner.de Kappa Concerns Solved • Data availability / retention → Compacted Topics, Tiered Storage • Data consistency and fault-tolerance → Exactly-once semantics, Multi-Region Clusters, Cluster Linking • Handling late-arriving data → State management in the streaming application, proper data sinks, replay with guaranteed ordering and timestamps • Data reprocessing and backfill → Dynamic clusters, stateful applications (Kafka Streams, ksqlDB, external stream processing framework like Apache Flink) • Data integration → Kafka Connect for sources and sinks, clients for any language, REST Proxy (real-time but also batch and RPC)
  20. 20. @KaiWaehner www.kai-waehner.de The Top 5 Use Cases and Architectures for Data in Motion in 2022 1) The Kappa Architecture 2) Hyper-personalized Omnichannel 3) Multi-Cloud Deployments 4) Edge Analytics 5) Real-time Cybersecurity
  21. 21. @KaiWaehner www.kai-waehner.de The New Business Reality Technology is the business Innovation required for survival Yesterday’s data = failure Modern, real-time data infrastructure is required. Technology was a support function Innovation required for growth “Good enough” to run on yesterday’s data
  22. 22. @KaiWaehner www.kai-waehner.de Real-time automation of customer interactions Improved Shipping and Delivery Methods Customer-Driven In-Store Experiences Hybrid model Shopping Social Influencers / Virtual Reality Shopping: Journey-focused innovation General Trends: ● Highly competitive market, work to thin margins ● Moving from High Street (brick & mortar) to Online (Omni-Channel) ● Personalized Customer Experience - optimal buyer journey Customer Experience (CX) Operational Efficiencies New Business Models Disruptive Trends in Retail Warehouse logistics teams aligned with real-time, in-store demands Automating the supply chain and core business processes Data-Driven Business Decisions and Personalized Promotions
  23. 23. @KaiWaehner www.kai-waehner.de “Walmart is a $500 billion in revenue company, so every second is worth millions of dollars. Having Confluent as our partner has been invaluable. Kafka and Confluent are the backbone of our digital omnichannel transformation and success at Walmart.” VP of Walmart Cloud
  24. 24. @KaiWaehner www.kai-waehner.de Real-Time Inventory System https://www.confluent.io/blog/walmart-real-time-inventory-management-using-kafka/ https://www.confluent.io/kafka-summit-san-francisco-2019/when-kafka-meets-the-scaling-and-reliability-needs-of-worlds-largest-retailer-a-walmart-story/ ● Investment in Kafka and Confluent has helped topline company growth ● 8,500 nodes processing 11 billion events per day ● Deliver an omnichannel experience so every customer can shop the way they want to
  25. 25. @KaiWaehner www.kai-waehner.de Context-specific Customer 360 25 Electrical retailer Hyper-personalized online retail experience, turning each customer visit into a one-on-one marketing opportunity Correlation of historical customer data with real-time digital signals Maximize customer satisfaction and revenue growth, increased customer conversions https://www.confluent.io/customers/ao/
  26. 26. @KaiWaehner www.kai-waehner.de Dick’s Sporting Goods 26 America’s largest sporting goods retail company Focused on helping athletes achieve their personal best Reshape the way athletes gain access to context-specific product information in real time for a more seamless purchasing experience online and in stores Handle pricing and promotions, marketing, and athlete services in real time to ensure a consistent omnichannel experience and positive athlete service interaction Fully-managed multi-cloud strategy with Confluent Cloud for improved time-to-market and reduced operations cost. confluent.io/customers/dicks-sporting-goods
  27. 27. @KaiWaehner www.kai-waehner.de Omnichannel Retail Time P C3 C2 C1 Sales Talk on site in Car Dealership Right now Location-based Customer Action Customer 360 (Website, Mobile App, On Site in Store, In-Car) Car Configurator 10 and 8 days ago Context-specific Marketing Campaign 90 and 60 days ago
  28. 28. @KaiWaehner www.kai-waehner.de Live commerce with real-time data correlation including integration of CRM, loyalty, inventory, chatbots, location-based services, etc. 28
  29. 29. @KaiWaehner www.kai-waehner.de Omnichannel Retail Time P C3 C2 C1 Machine Learning Context-specific Recommendations Location-based Customer Action Customer 360 (Business Intelligence, Machine Learning) Machine Learning Train Recommendation Engine Reporting All Customer Interactions
  30. 30. @KaiWaehner www.kai-waehner.de The Top 5 Use Cases and Architectures for Data in Motion in 2022 1) The Kappa Architecture 2) Hyper-personalized Omnichannel 3) Multi-Cloud Deployments 4) Edge Analytics 5) Real-time Cybersecurity
  31. 31. @KaiWaehner www.kai-waehner.de Spaghetti: Data architectures are often complex
  32. 32. @KaiWaehner www.kai-waehner.de Kafka provides a solution: An immutable stream of facts with the freedom to act, adapt, and change 32 Kafka
  33. 33. @KaiWaehner www.kai-waehner.de Domain 33 Data Product Data Mesh: A new technology-agnostic decentralized implementation pattern Data Mesh Data ownership by domain Data as a product Data available anywhere, self serve Data governed wherever it is
  34. 34. @KaiWaehner www.kai-waehner.de 34 Mesh is one logical cluster. Data product has another. Data Product Data Product has its own cluster for internal use
  35. 35. @KaiWaehner www.kai-waehner.de 35 With stream processing the real-time applications are decentralized Data Product STREAM PROCESSOR ksqlDB Query is the interface to the mesh Events are the interface to the mesh
  36. 36. @KaiWaehner www.kai-waehner.de 36 Operational and Data Product Streaming Planes with Cluster Linking
  37. 37. @KaiWaehner www.kai-waehner.de Data Mesh Example: Hybrid Multi-Cloud Architecture 37 Data Engineers Data Scientists Data Architects Operators Architects SMEs Data Governance Shared Services Application team Generalist Eng Generalists Eng Specialized / Legacy Engineers
  38. 38. @KaiWaehner www.kai-waehner.de Kafka as a Service – Fully Managed? Infrastructure management (commodity) Scaling ● Upgrades (latest stable version of Kafka) ● Patching ● Maintenance ● Sizing (retention, latency, throughput, storage, etc.) ● Data balancing for optimal performance ● Performance tuning for real-time and latency requirements ● Fixing Kafka bugs ● Uptime monitoring and proactive remediation of issues ● Recovery support from data corruption ● Scaling the cluster as needed ● Data balancing the cluster as nodes are added ● Support for any Kafka issue with less than X minutes response time Infra-as-a-Service Harness full power of Kafka Kafka-specific management Platform-as-a-Service Evolve as you need Future-proof Mission-critical reliability Most Kafka-as-a-Service offerings are partially-managed Kafka as a Service should be a serverless experience with consumption-based pricing!
  39. 39. @KaiWaehner www.kai-waehner.de Data Governance: Tracking data lineage with Streams in real-time 39 • Lineage must work across domains and data products—and systems, clouds, data centers. • Event streaming is a foundational technology for this. On-premise
  40. 40. @KaiWaehner www.kai-waehner.de The Top 5 Use Cases and Architectures for Data in Motion in 2022 1) The Kappa Architecture 2) Hyper-personalized Omnichannel 3) Multi-Cloud Deployments 4) Edge Analytics 5) Real-time Cybersecurity
  41. 41. @KaiWaehner www.kai-waehner.de What is the “Edge” for Kafka? • Edge is NOT a data center • Kafka clients AND the Kafka broker(s) • Offline business continuity • Often 100+ locations • Low-footprint and low-touch • Hybrid integration
  42. 42. @KaiWaehner www.kai-waehner.de Edge Use Cases with Low Latency Requirements https://www.youtube.com/watch?v=A9DDe0alvGo
  43. 43. @KaiWaehner www.kai-waehner.de Low Latency 5G Use Cases for Edge and Hybrid Cloud with AWS Wavelength (based on AWS Outposts) and Confluent
  44. 44. @KaiWaehner www.kai-waehner.de CRM 3rd party payment provider Context-specific real-time upsell Customer data Payment processing and fraud detection as a service Manager Get report API Customer Customer Customer data Train schedule Payment data Loyalty information Streams of real time events Hybrid Retail Architecture
  45. 45. @KaiWaehner www.kai-waehner.de Point of Sale (POS) Loyalty System Local Inventory Management Payment Discount Customer data Train schedule Payment data Loyalty information Streams of real time events Global Inventory Management Event Streaming at the Edge in the Smart Retail Store Item Availability
  46. 46. @KaiWaehner www.kai-waehner.de Disconnected Edge Time P C3 C2 C1 Context-specific Advertisement Real-time (Milliseconds) Location-based Customer Action Always on (even “offline”) Replayability Reduced traffic cost Better latency Payment Processing Near Real-time (Seconds) Replication to Cloud Batch (Depending on Network Bandwidth)
  47. 47. @KaiWaehner www.kai-waehner.de Ship-Shore Highway – Swimming Retail Stores https://www.confluent.io/kafka-summit-lon19/seamless-guest-experience-with-kafka-streams/
  48. 48. @KaiWaehner www.kai-waehner.de Devon Energy Corporation Oil & Gas Industry Improve drilling and well completion operations Edge stream processing/analytics + closed-loop control ready Replication to the cloud in real-time at scale Vendor agnostic (pumping, wireline, coil, offset wells, drilling operations, producing wells) Cloud agnostic (AWS, GCP, Azure)
  49. 49. @KaiWaehner www.kai-waehner.de The Top 5 Use Cases and Architectures for Data in Motion in 2022 1) The Kappa Architecture 2) Hyper-personalized Omnichannel 3) Multi-Cloud Deployments 4) Edge Analytics 5) Real-time Cybersecurity
  50. 50. @KaiWaehner www.kai-waehner.de What is Cybersecurity? Protection of computer systems and networks from information disclosure and theft Web Scraping, hackers, criminals, terrorists, state-sponsored and state-initiated actors 50
  51. 51. @KaiWaehner www.kai-waehner.de Supply Chain Attack Targeting less-secure elements in the supply chain 51 https://www.nortonrosefulbright.com/en/knowledge/publications/dfa3603c/six-degrees-of-separation-cyber-risk-across-global-supply-chains https://www.reuters.com/article/us-tmobile-dataprotection-idUSKCN0RV5PL20151002
  52. 52. @KaiWaehner www.kai-waehner.de Real-time Data in Motion beats Slow Data. Security Access control and encryption Regulatory compliance Rules engine Security monitoring Surveillance Cybersecurity Risk classification Threat detection Intrusion detection Incident response Fraud detection
  53. 53. @KaiWaehner www.kai-waehner.de Data in Motion The Backbone for Cybersecurity Industrial OT Enterprise IT Consumer IoT Logs Personal Sensors Security Streams of real time events Connected Vehicles Cyber Security Continuous Data Correlation Monitoring Alerting Proactive Actions
  54. 54. @KaiWaehner www.kai-waehner.de End-to-End Cybersecurity with the Kafka Ecosystem Personnel Crew, Cargo Vessel Fuel Consumption, Speed, Planned Maintenance Tracking Position, Course, Weather, Draft Drone or Satellite Relay COMMs Resilient Kafka Edge Analytics Data Integration Streaming Analytics Machine Doing On-Prem Systems Bi-Directional Hybrid Cloud Replication ON SHORE ON PREM Staging, Filtering Shore Edge Analytics
  55. 55. @KaiWaehner www.kai-waehner.de SIEM / SOAR Situational Awareness Operational Awareness Intrusion Detection Signals and Noise Signature Detection Incident Response Threat Hunting & Intelligence Vulnerability Management Digital Forensics … was not built for cybersecurity!
  56. 56. @KaiWaehner www.kai-waehner.de Integrate with all legacy and modern interfaces Record, filter, curate a broad set of traffic streams Let analytic sinks consume just the right amount of data Drastically reduce the complexity of the enterprise architectures Drastically reduce the cost of SIEM / SOAR deployments Add new analytics engines Add stream-speed detection and response at scale in real-time Add mission-critical (non-) security-related applications … is the backbone for cybersecurity!
  57. 57. @KaiWaehner www.kai-waehner.de Confluent Sigma Sigma Stream Processors Zeek Data and Detections Viewer Sigma Rule Editor sigma rules topic DNS dns detections topic dns topic rule parsing, filtering, aggregation, windowing sigma rules cache CONN DHCP HTTP SSL x509 Zeek Data
  58. 58. @KaiWaehner www.kai-waehner.de Cyber Intelligence Platform leveraging Kafka Connect, Kafka Streams, Multi-Region Clusters (MRC), and more… https://www.intel.com/content/www/us/en/it-management/intel-it-best-practices/modern-scalable-cyber-intelligence-platform-kafka.html
  59. 59. @KaiWaehner www.kai-waehner.de How does Confluent help?
  60. 60. @KaiWaehner www.kai-waehner.de The Rise of Data in Motion 2010 Apache Kafka created at LinkedIn by Confluent founders 2014 2020 80% Fortune 100 Companies trust and use Apache Kafka 60
  61. 61. @KaiWaehner www.kai-waehner.de Event Streaming Maturity Model (axes: Investment & Time, Value) 1) Initial Awareness / Pilot (1 Kafka Cluster) 2) Start to Build Pipeline / Deliver 1 New Outcome (1 Kafka Cluster) 3) Mission-Critical Deployment (Stretched, Hybrid, Multi-Region) 4) Build Contextual Event-Driven Apps (Stretched, Hybrid, Multi-Region) 5) Central Nervous System (Global Kafka) Product, Support, Training, Partners, Technical Account Management...
  62. 62. @KaiWaehner www.kai-waehner.de Car Engine Car Self-driving Car Confluent completes Apache Kafka. Cloud-native. Everywhere.
  63. 63. @KaiWaehner www.kai-waehner.de Confluent... Complete. Cloud-native. Everywhere. Freedom of Choice Committer-driven Expertise Open Source | Community licensed Fully Managed Cloud Service Self-managed Software Training Partners Enterprise Support Professional Services ARCHITECT OPERATOR DEVELOPER EXECUTIVE Apache Kafka Dynamic Performance & Elasticity Self-Balancing Clusters | Tiered Storage Flexible DevOps Automation Operator | Ansible GUI-driven Mgmt & Monitoring Control Center | Proactive Support Event Streaming Database ksqlDB Rich Pre-built Ecosystem Connectors | Hub | Schema Registry Multi-language Development Non-Java Clients | REST Proxy Admin REST APIs Global Resilience Multi-Region Clusters | Replicator Cluster Linking Data Compatibility Schema Registry | Schema Validation Enterprise-grade Security RBAC | Secrets | Audit Logs TCO / ROI Revenue / Cost / Risk Impact Complete Engagement Model Efficient Operations at Scale Unrestricted Developer Productivity Production-stage Prerequisites Partnership for Business Success
  64. 64. Kai Waehner Field CTO kai.waehner@confluent.io @KaiWaehner kai-waehner.de confluent.io linkedin.com/in/kaiwaehner Questions? Feedback? Let’s connect!
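Slide 19 names compacted topics as one answer to the retention concern in a Kappa architecture. The sketch below shows the basic idea under simplifying assumptions: only the most recent value per key is kept, and a key whose latest record is a tombstone (`None`) is dropped. It is a plain-Python model, not Kafka's log cleaner, which works on segments and retains tombstones for a configurable period. The `compact` function and the sample keys are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A record is (key, value); a value of None models a tombstone (delete marker).
Record = Tuple[str, Optional[str]]


def compact(log: List[Record]) -> List[Record]:
    """Keep only the latest record per key, in log order; drop keys whose latest record is a tombstone."""
    latest_index: Dict[str, int] = {}
    for i, (key, _) in enumerate(log):
        latest_index[key] = i  # later occurrences overwrite earlier ones
    return [
        (key, value)
        for i, (key, value) in enumerate(log)
        if latest_index[key] == i and value is not None
    ]


changelog: List[Record] = [
    ("sku-1", "stock=10"),
    ("sku-2", "stock=5"),
    ("sku-1", "stock=7"),   # supersedes the first sku-1 record
    ("sku-2", None),        # tombstone: sku-2 was deleted
    ("sku-3", "stock=2"),
]

compacted = compact(changelog)
```

A consumer that replays `compacted` can rebuild the current inventory state without reading every historical update, which is what makes long retention affordable for stateful reprocessing.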

Editor's Notes

  • I want to call out four major trends: (1) cloud, (2) AI and machine learning, (3) mobile devices and ubiquitous connectivity, (4) event streaming. Each of these trends changes the way we think.
    1) The cloud has changed how we think about data centers and running technical infrastructure. Today, every company is moving to the cloud—your company is [quite likely] doing the same.
    2) Machine learning changes how decisions are being made, and this happens increasingly in an automated manner, driven by software that talks to other software.
    3) Mobile devices and Internet connectivity have dramatically changed the user experience of how customers want to interact with us, and raised the bar for their expectations. If you can rent the latest blockbuster movie with 1 click on an iPad, you will no longer accept that your bank can take hours or days to inform you of a payment.
    4) Event streaming has changed how we think about and how we work with the data that underlies all the other trends. This is the subject of this talk, so let’s take a closer look!
  • The same is true for running a business. No matter the industry, real-time data beats slow data. Here are but a few examples, some of which you may recognize from your own use cases.
  • So Event Streaming is really a fundamental paradigm shift. Just like the Cloud is the future of the Data CENTER, where we now treat physical infrastructure as software code so we can spin up new servers in a matter of seconds, Event Streaming is the future of DATA itself. Here, we realize that, in the real world, data about our business is a continuous, never-ending stream of events, and customers expect us to understand and respond immediately to all this information. [NEXT SLIDE, “What is Event Streaming?”]
  • There is a new business reality. In the past, technology was a mere support function. We innovated when we needed to grow the business. And in this situation, it was “good enough” to run the business on yesterday’s data. But today, technology IS the business. And if you don’t innovate, you will lose to the competition. And in order to survive, we need modern, real-time data infrastructures.
  • Retail playbook: https://docs.google.com/document/d/1NlUJGvblMZ9bcyzyvdKKpdtaYBIKzsck5Cugq_sJP_U/edit#

  • Here is the story of Walmart, the largest retailer in the world. Walmart’s success is largely dependent on their digital capabilities. Let me share just a few numbers of what they need to integrate: 5000+ stores, 150+ distribution centers, 1000+ vendors, 53K+ trailers owned, 1M+ online transactions, 25M customers per week. Today, Kafka is used for Walmart’s real-time inventory systems, fulfillment, security, fraud prevention. It’s used all across Walmart.com: every single click is streamed into Kafka and made available to every application that needs to consume that data. Another example is Walmart’s grocery pick-up business, which has become more important than ever in the age of COVID. Event streaming enables this from the beginning to the end: when customers interact with their app, all the user behavioral data is streamed into Confluent. When orders are placed, all data flows into Confluent. When the customer enters the store to pick up their groceries, those events are streamed to Confluent. And so on. As we can see, event streaming and Kafka are at the heart of Walmart’s success and their digital transformation.
  • Small scale data pipelines constantly broken.

    Large scale: finance and risk have completely different numbers. Story of one path for books in an investment bank. Booz Allen Hamilton. 3 months analysis. 2 hours to explain.

  • This allows the applications to connect around data in motion.
    It acts as a kind of central nervous system.
    It lets something happening in one part of the company trigger the right updates and responses everywhere else as it occurs.
  • ...Event Streaming with Kafka. Here, data is provided to other data products through streams in Kafka. And any data product can consume via Kafka from the high-quality data streams of other data products. As we can see, this idea of a data mesh is very similar to the idea of a Central Nervous System, where data is continuously flowing, being processed, analyzed, acted upon. Now, we must remember that the data mesh shown here is a LOGICAL view, not a physical one. [OUTRO] If you know Kafka, you know that the reality looks a bit different and...a bit better.
  • ksqlDB turns the data mesh into something you can query, while still having all the benefits of being decentralized
  • A self-serve platform can have multiple planes that each serve a different profile of users. The following example lists three different data platform planes:
    Data infrastructure provisioning plane: supports the provisioning of the underlying infrastructure, required to run the components of a data product and the mesh of products. This includes provisioning of a distributed file storage, storage accounts, access control management system, the orchestration to run data products internal code, provisioning of a distributed query engine on a graph of data products, etc. I would expect that either other data platform planes or only advanced data product developers use this interface directly. This is a fairly low level data infrastructure lifecycle management plane.
    Data product developer experience plane: this is the main interface that a typical data product developer uses. This interface abstracts many of the complexities of what it takes to support the workflow of a data product developer. It provides a higher level of abstraction than the 'provisioning plane'. It uses simple declarative interfaces to manage the lifecycle of a data product. It automatically implements the cross-cutting concerns that are defined as a set of standards and global conventions, applied to all data products and their interfaces.
    Data mesh supervision plane: there are a set of capabilities that are best provided at the mesh level - a graph of connected data products - globally. While the implementation of each of these interfaces might rely on individual data products capabilities, it’s more convenient to provide these capabilities at the level of the mesh. For example, ability to discover data products for a particular use case, is best provided by search or browsing the mesh of data products; or correlating multiple data products to create a higher order insight, is best provided through execution of a data semantic query that can operate across multiple data products on the mesh.



  • In this final example, we can see again that there are lots of data streams within a data mesh. These data streams may span across systems, data centers, clouds, and so on. For the purpose of tracking data lineage, we ideally want to cover the full mesh, so we must follow the data. Event streaming is again a key technology to implement this in practice, because it lets you track data-in-motion all the way from its origins to intermediate and to the final destinations.
  • The same is true for running a business. No matter the industry, real-time data beats slow data. Here are but a few examples, some of which you may recognize from your own use cases.
  • The rise of Event Streaming can be traced back to 2010, when Apache Kafka was created by the future Confluent founders in Silicon Valley. From there, Kafka began spreading throughout Silicon Valley and across the US West coast. [CLICK] Then, in 2014, Confluent was created with the goal to turn Kafka into an enterprise-ready software stack and cloud offering, after which the adoption of Kafka started to really accelerate. [CLICK] Fast forward to 2020, tens of thousands of companies across the world and across all kinds of industries are using Kafka for event streaming.
    What I am telling my family and friends is: You are a Kafka user, whether you know it or not. When you use a smartphone, shop online, make a payment, read the news, listen to music, drive a car, or book a flight, it’s very likely that this is powered by Kafka behind the scenes. Kafka is applied even to use cases that I personally would have never predicted, like by scientists for research on astrophysics, where Kafka is used for automatically coordinating globally distributed, large telescopes to record interstellar phenomena!
  • Know the 5 stages and a talking point for each one.



    There’s a common pattern of how organizations adopt this technology.

    First, there is initial awareness or a pilot, where an organization is getting to know the technology.

    This is followed by the initial development of a basic event pipeline, and the delivery of at least one new business outcome - maybe provisioning a single source of truth for microservices, or offloading data from a mainframe.

    The third stage involves incorporating and leveraging stream processing. In this stage, an organization is not only collecting and transporting data in real-time, but also processing it for added value.

    The fourth stage is when an organization starts to build business-transforming contextual event-driven applications. This is a new category of applications - unique to event streaming - where real-time events can be combined with context to deliver powerful, profitable outcomes.

    The last stage is when event streaming is pervasive and becomes the central nervous system of the enterprise.

    Examples of this in the consumer world are Netflix and LinkedIn… and in the enterprise world are organizations like Capital One.

    Confluent accelerates the trajectory of customer journeys to event streaming through its products, support, training, our partner ecosystem and technical account management and services. Let’s talk about you - Where do you see your team on this journey today? How about your LOBs? Your company as a whole? Let’s talk for a few minutes about how we can get you where you need to go.
  • What we build is a full, enterprise-ready platform to complete open source Apache Kafka.

    On top of Kafka, we build a set of features to unleash developer productivity, including the ability to leverage Kafka in languages other than Java, a rich pre-built ecosystem including 100+ connectors so developers don’t have to spend time building connectors themselves, and enabling stream processing with the ease and familiarity of SQL.

    Kafka can sometimes be complex and difficult to operate at scale… we make that easy through GUI-based management and monitoring, DevOps automation including with Kubernetes Operator, and enabling dynamic performance and elasticity in deploying Kafka.

    Also, we offer a set of features many organizations consider prerequisites when deploying mission-critical apps on Kafka. These include security features that control who has access to what, the ability to investigate potential security incidents via audit logs, schema validation to ensure that only ‘clean’ data (and no ‘dirty’ data) enters Kafka, and resilience features so that, for example, if your data center goes down, your customer-facing applications stay running.

    We offer all of this with freedom of choice, meaning you can choose self-managed software that you can deploy anywhere, including on-premises, public cloud, private cloud, containers, or Kubernetes. Or you can choose our fully managed cloud service, available on all 3 major cloud providers.

    And, importantly, underpinning all this is our committer-led expertise. We at Confluent have over X hours of experience with Kafka. We offer support, professional services, training, and a full partner ecosystem. Simply put, there is no other organization in the world better suited to be an enterprise partner, and no organization in the world that is more capable of ensuring your success. This means everything to the organizations we work with.
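The data lineage note earlier in this list describes following data from its origin through intermediate systems to its final destination. One way to picture this is an event that carries its own trail, with each system appending its name as the event passes through. The sketch below is a plain-Python illustration under that assumption. `Event`, `hop`, and the system names are hypothetical. In a Kafka deployment, record headers might be one place to carry such a trail, but that is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Event:
    """An event plus the ordered list of systems it has passed through."""
    payload: Dict[str, object]
    lineage: Tuple[str, ...] = ()


def hop(
    event: Event,
    system: str,
    transform: Callable[[Dict[str, object]], Dict[str, object]] = lambda p: p,
) -> Event:
    """Pass an event through one system, recording the system name in its lineage."""
    new_payload = transform(dict(event.payload))  # copy so earlier hops stay unchanged
    return Event(payload=new_payload, lineage=event.lineage + (system,))


# Origin -> intermediate processing -> final destination (all names are illustrative).
raw = hop(Event({"order_id": 1, "amount": 20}), "store-pos")
enriched = hop(raw, "edge-stream-processor", lambda p: {**p, "currency": "USD"})
stored = hop(enriched, "cloud-data-warehouse")
```

After the three hops, `stored.lineage` lists every system in order. Each intermediate event keeps its own shorter trail, so the path can be inspected at any point along the way.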
