SlideShare a Scribd company logo
1 of 13
Download to read offline
Designing a Feedback Loop for
Event-Driven Data Sharing
Enabled with
Kafka Connect and KSQL
Presented by: Teresa Wang
weijiuan.t.wang@jpl.nasa.gov
Enterprise Business Information Services Division (EBIS)
Jet Propulsion Laboratory
Oct. 4th 2022
10/4/2022 2
Agenda
• Use Case
• Challenges
• Design Principles
• Solution Design
• Summary
Disclaimer
Reference herein to any specific commercial product, process, or
service by trade name, trademark, manufacturer, or otherwise, does
not constitute or imply its endorsement by the United States
Government or the Jet Propulsion Laboratory, California Institute of
Technology
10/4/2022 3
• Handles complex data dependencies across tables and respecting business and atomicity requirements between the source and target systems
• Establishes a robust feedback loop to properly notify sending system about successful deliveries and to identify/remediate errors
10/4/2022 4
Tasks: Design a fully automated data transport system that …
Earned Value Management System
Scope
Budgets
Schedule
Challenges
• The triggering event is recorded in a staging table and data related to the triggered event are
located in 9 separate tables with Foreign-Key relationship
• Some source tables have a large data structure (e.g. > 120 data elements)
– Cannot use multi-joins in a single source connector for query-based polling for all involved source tables
– Certain data elements require data type recasting
• Determining the event-based data transport has completed and received by the destination
– Each source table contains various amount of data; may be as few as dozens, may be more than 100,000 rows
– How to determine if an error occurred during data transport?
• Triggering downstream processing in source and destination systems upon event completion
10/4/2022 5
Design Principles
• Maximizing the usage of Kafka Components
– To ensure data consistency and integrity
– Extend/Augment as necessary
• Separation-of-Concerns
– Data Transport Abstraction means …
• No need for software engineers to write Kafka producers or consumers
• Kafka can do SMT for label translation for business domain-specific terms
• Maintainability & Reusability
– Prefer configuration over coding for change requests
– Future-proof with enabling data pipeline extensions in mind
– Framework is reusable for other use cases
10/4/2022 6
10/4/2022 7
Solution Design Components
KSQL
Streaming,
KTables
Control -
Manifest
W
iretapping
JDBC
Java
Microservices
Confluent
Connectors
9
Source
connectors
Event table
9 data tables
9 Topics
9
Sink
connectors
The Basis: A Typical Data Pipeline Design with Source/Sink Connectors
? Event status ?
9 data tables
? Event status ?
Kafka
EVM sys.
Visualized Data Pipeline Design feedback flows:
10/4/2022 9
Topic
PROGRAM
Topic
WIRETAPS
Topic
PENDING_ALERT
S
KSQL-Stream
PROGRAM_S
KSQL-Table
PROGRAM_T
KSQL-Table/Topic
EVENT_COMPLETION_T
KSQL-Stream
WIRETAPS_S
KSQL-Table
WIRETAPS_T
KSQL-Table/Topic
ALERTS_T
KSQL-Stream
PENDING_ALERTS_S
Source
connectors
Wiretapped
Sink
connectors
Sink
connectors
Watcher microservice
Notifications
microservice
Other data
Topics
pivoted
(compares running totals in
WIRETAPS_T with control totals on manifest in PROGRAM_T)
Summary – An Enterprise Integration Design Pattern
10/4/2022 10
Supervisory Structure
• Acquires from the source a manifest for each event
• Reports unmet expectations with a continuously running “watcher” microservice
• Keep producing “pending” alerts until an event is either completed or erred
Wiretapped
Sink Connectors
• Captures the #-messages written to destination during sink connectors deliveries
KSQL Aggregations
• for event-completion declaration when expectations are met
• for declaring an “alert” when the “#-pending alerts” exceeds a preset threshold
Producing Feedbacks
• Consuming from both event-completions (Successes) & event-errors (Alerts) topics
• Sent to both source and target systems to trigger further processing
Summary – Why We Like This Solution Design
• Data transport abstraction: Software Engineers don’t need to code for
– Kafka producers or consumers
– Web services for data exchange validation with the target system
• Leveraging KSQL aggregations for
– Declaration on an event “completion”, or
– Declaration on an event “alert”
• A minimalist approach – Less (coding by data engineers) is More!
– Extending JDBC to enable wiretapping on sink connectors based on configurable attributes
– Single purpose “Watcher” microservice to produce “pending alerts”
– Employing an existing “notification” microservice for multi-channel broadcasting about ”alerts”
10/4/2022 11
Acknowledgements
Jacob Nowicki for introducing Kafka to the EBIS Division at JPL !!
Peter Grzegorczyk for his innovative engineering collaboration !!
10/4/2022 12
jpl.nasa.gov
10/4/2022 13

More Related Content

Similar to Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Current 2022

Opal: Simple Web Services Wrappers for Scientific Applications
Opal: Simple Web Services Wrappers for Scientific ApplicationsOpal: Simple Web Services Wrappers for Scientific Applications
Opal: Simple Web Services Wrappers for Scientific ApplicationsSriram Krishnan
 
Building cloud native data microservice
Building cloud native data microserviceBuilding cloud native data microservice
Building cloud native data microserviceNilanjan Roy
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Apache Apex
 
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Dataconomy Media
 
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...confluent
 
APAC Kafka Summit - Best Of
APAC Kafka Summit - Best Of APAC Kafka Summit - Best Of
APAC Kafka Summit - Best Of confluent
 
Who's in your Cloud? Cloud State Monitoring
Who's in your Cloud? Cloud State MonitoringWho's in your Cloud? Cloud State Monitoring
Who's in your Cloud? Cloud State MonitoringKevin Hakanson
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformDATAVERSITY
 
Optimising Service Deployment and Infrastructure Resource Configuration
Optimising Service Deployment and Infrastructure Resource ConfigurationOptimising Service Deployment and Infrastructure Resource Configuration
Optimising Service Deployment and Infrastructure Resource ConfigurationRECAP Project
 
From Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata SingaporeFrom Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata SingaporeOfir Sharony
 
Databus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineDatabus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineSunil Nagaraj
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafkaconfluent
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...confluent
 
Seattle spark-meetup-032317
Seattle spark-meetup-032317Seattle spark-meetup-032317
Seattle spark-meetup-032317Nan Zhu
 
Patterns & Practices of Microservices
Patterns & Practices of MicroservicesPatterns & Practices of Microservices
Patterns & Practices of MicroservicesWesley Reisz
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
8+ years of experience in Java/J2EE technologies
8+ years of experience in Java/J2EE technologies8+ years of experience in Java/J2EE technologies
8+ years of experience in Java/J2EE technologiessukanyanarra
 

Similar to Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Current 2022 (20)

Opal: Simple Web Services Wrappers for Scientific Applications
Opal: Simple Web Services Wrappers for Scientific ApplicationsOpal: Simple Web Services Wrappers for Scientific Applications
Opal: Simple Web Services Wrappers for Scientific Applications
 
Building cloud native data microservice
Building cloud native data microserviceBuilding cloud native data microservice
Building cloud native data microservice
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
 
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
 
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
 
APAC Kafka Summit - Best Of
APAC Kafka Summit - Best Of APAC Kafka Summit - Best Of
APAC Kafka Summit - Best Of
 
GouthamLaveti
GouthamLavetiGouthamLaveti
GouthamLaveti
 
Who's in your Cloud? Cloud State Monitoring
Who's in your Cloud? Cloud State MonitoringWho's in your Cloud? Cloud State Monitoring
Who's in your Cloud? Cloud State Monitoring
 
rakesh_resume
rakesh_resumerakesh_resume
rakesh_resume
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics Platform
 
Optimising Service Deployment and Infrastructure Resource Configuration
Optimising Service Deployment and Infrastructure Resource ConfigurationOptimising Service Deployment and Infrastructure Resource Configuration
Optimising Service Deployment and Infrastructure Resource Configuration
 
Siva-CV
Siva-CVSiva-CV
Siva-CV
 
From Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata SingaporeFrom Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata Singapore
 
Databus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineDatabus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture Pipeline
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafka
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...
 
Seattle spark-meetup-032317
Seattle spark-meetup-032317Seattle spark-meetup-032317
Seattle spark-meetup-032317
 
Patterns & Practices of Microservices
Patterns & Practices of MicroservicesPatterns & Practices of Microservices
Patterns & Practices of Microservices
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
8+ years of experience in Java/J2EE technologies
8+ years of experience in Java/J2EE technologies8+ years of experience in Java/J2EE technologies
8+ years of experience in Java/J2EE technologies
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Current 2022

  • 1. Designing a Feedback Loop for Event-Driven Data Sharing Enabled with Kafka Connect and KSQL Presented by: Teresa Wang weijiuan.t.wang@jpl.nasa.gov Enterprise Business Information Services Division (EBIS) Jet Propulsion Laboratory Oct. 4th 2022
  • 2. 10/4/2022 2 Agenda • Use Case • Challenges • Design Principles • Solution Design • Summary
  • 3. Disclaimer Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology 10/4/2022 3
  • 4. • Handles complex data dependencies across tables and respecting business and atomicity requirements between the source and target systems • Establishes a robust feedback loop to properly notify sending system about successful deliveries and to identify/remediate errors 10/4/2022 4 Tasks: Design a fully automated data transport system that … Earned Value Management System Scope Budgets Schedule
  • 5. Challenges • The triggering event is recorded in a staging table and data related to the triggered event are located in 9 separate tables with Foreign-Key relationship • Some source tables have a large data structure (e.g. > 120 data elements) – Cannot use multi-joins in a single source connector for query-based polling for all involved source tables – Certain data elements require data type recasting • Determining the event-based data transport has completed and received by the destination – Each source table contains various amount of data; may be as few as dozens, may be more than 100,000 rows – How to determine if an error occurred during data transport? • Triggering downstream processing in source and destination systems upon event completion 10/4/2022 5
  • 6. Design Principles • Maximizing the usage of Kafka Components – To ensure data consistency and integrity – Extend/Augment as necessary • Separation-of-Concerns – Data Transport Abstraction means … • No need for software engineers to write Kafka producers or consumers • Kafka can do SMT for label translation for business domain-specific terms • Maintainability & Reusability – Prefer configuration over coding for change requests – Future-proof with enabling data pipeline extensions in mind – Framework is reusable for other use cases 10/4/2022 6
  • 7. 10/4/2022 7 Solution Design Components KSQL Streaming, KTables Control - Manifest W iretapping JDBC Java Microservices Confluent Connectors
  • 8. 9 Source connectors Event table 9 data tables 9 Topics 9 Sink connectors The Basis: A Typical Data Pipeline Design with Source/Sink Connectors ? Event status ? 9 data tables ? Event status ? Kafka EVM sys.
  • 9. Visualized Data Pipeline Design feedback flows: 10/4/2022 9 Topic PROGRAM Topic WIRETAPS Topic PENDING_ALERT S KSQL-Stream PROGRAM_S KSQL-Table PROGRAM_T KSQL-Table/Topic EVENT_COMPLETION_T KSQL-Stream WIRETAPS_S KSQL-Table WIRETAPS_T KSQL-Table/Topic ALERTS_T KSQL-Stream PENDING_ALERTS_S Source connectors Wiretapped Sink connectors Sink connectors Watcher microservice Notifications microservice Other data Topics pivoted (compares running totals in WIRETAPS_T with control totals on manifest in PROGRAM_T)
  • 10. Summary – An Enterprise Integration Design Pattern 10/4/2022 10 Supervisory Structure • Acquires from the source a manifest for each event • Reports unmet expectations with a continuously running “watcher” microservice • Keep producing “pending” alerts until an event is either completed or erred Wiretapped Sink Connectors • Captures the #-messages written to destination during sink connectors deliveries KSQL Aggregations • for event-completion declaration when expectations are met • for declaring an “alert” when the “#-pending alerts” exceeds a preset threshold Producing Feedbacks • Consuming from both event-completions (Successes) & event-errors (Alerts) topics • Sent to both source and target systems to trigger further processing
  • 11. Summary – Why We Like This Solution Design • Data transport abstraction: Software Engineers don’t need to code for – Kafka producers or consumers – Web services for data exchange validation with the target system • Leveraging KSQL aggregations for – Declaration on an event “completion”, or – Declaration on an event “alert” • A minimalist approach – Less (coding by data engineers) is More! – Extending JDBC to enable wiretapping on sink connectors based on configurable attributes – Single purpose “Watcher” microservice to produce “pending alerts” – Employing an existing “notification” microservice for multi-channel broadcasting about ”alerts” 10/4/2022 11
  • 12. Acknowledgements Jacob Nowicki for introducing Kafka to the EBIS Division at JPL !! Peter Grzegorczyk for his innovative engineering collaboration !! 10/4/2022 12