Case-Study: Building Real-Time Applications at Scale-Cyclist Crash Detection with Tomas Neubauer

Quix Streams — Kafka Summit 2023 | 1
Quix Streams
Building Real-Time Applications at Scale
Quix Streams — Kafka Summit 2023 | 2
Tomas Neubauer
Previously McLaren technical lead
CTO & Co-founder, Quix
Hello, nice to meet you! 👋
Quix Streams — Kafka Summit 2023 | 3
Racing
background
Roots in real-time data processing in the
most extreme, time-critical environment.
● 50,000 channels per car
● 1.5 kHz per channel
● 1,000s realtime models and simulations
Quix Streams — Kafka Summit 2023 | 3
Quix Streams — Kafka Summit 2023 | 4
Now raise your hand if you are using…
Quix Streams — Kafka Summit 2023 | 5
Kafka
Now raise your hand if you are using…
Quix Streams — Kafka Summit 2023 | 6
Streaming
Now raise your hand if you are using…
Quix Streams — Kafka Summit 2023 | 7
Python
Now raise your hand if you are using…
Quix Streams — Kafka Summit 2023 | 8
Goal
Crash detection
phone-data crashes
Fitness
app
Quix Streams — Kafka Summit 2023 | 9
● Architecture
● ML deployment
● Streaming landscape
● How it works
● Demo - Let's build it
Content
Quix Streams — Kafka Summit 2023 | 10
ML Deployment
Quix Streams — Kafka Summit 2023 | 10
REST API vs Streaming
Quix Streams — Kafka Summit 2023 | 11
phone-data
Websocket
gateway
alerts
Websocket
gateway
ANALYSIS & TRAINING
Trained
model
SageMaker
Architecture
Crash detection
Quix Streams — Kafka Summit 2023 | 12
ML Deployment with API
API REQUEST
WEB API
API RESPONSE
gX gY gZ gTotal
0.5 0.3 0.1 0.9
gX gY gZ gTotal Crash
0.5 0.3 0.1 0.9 1
SERVICE
Quix Streams — Kafka Summit 2023 | 13
Issues with REST APIs
REST API vs Streaming
Quix Streams — Kafka Summit 2023 | 13
Quix Streams — Kafka Summit 2023 | 14
Problems with REST API
API REQUEST
gX gY gZ gTotal
● CPU overhead
● Introducing delay
● Requests gets lost in case of service downtime or slow performance
WEB API
SERVICE
Quix Streams — Kafka Summit 2023 | 15
Problems with REST API
gX gY gZ gTotal
WEB API
SERVICE
API REQUEST
Quix Streams — Kafka Summit 2023 | 16
Problems with REST API
gX gY gZ gTotal
WEB API
SERVICE
API REQUEST
Quix Streams — Kafka Summit 2023 | 17
Problems with REST API
API REQUEST
gX gY gZ gTotal
API REQUEST
gX gY gZ gTotal
WEB API
SERVICE
WEB API
SERVICE
Quix Streams — Kafka Summit 2023 | 18
Stream processing
applications
An overview of stream
processing approaches
Quix Streams — Kafka Summit 2023 | 18
Quix Streams — Kafka Summit 2023 | 19
When you building stream processing applications with Kafka, there are two
options:
1. Just build an application that uses the Kafka producer and consumer APIs
directly
2. Adopt a full-fledged stream processing framework (Flink, Spark streaming,
Beam etc.)
Stream processing applications
Quix Streams — Kafka Summit 2023 | 20
● Works for simple stuff like one-message-at-a-time processing
● No external dependencies like JVM
● Gets very complicated when stateful processing is needed like calculation
aggregations or joining multiple streams
Kafka producer and consumer APIs
Quix Streams — Kafka Summit 2023 | 21
● Fully fledged stream processing frameworks solves stateful,
more complex operations
● But it is for a cost of increased complexity in many dimensions:
○ Java dependency
○ Deployment gets difficult because code is not running on its own but
in server side cluster (Flink cluster or Spark cluster)
○ Debugging is difficult
○ Performance optimization is difficult
○ Gets even worse when we combine synchronous architecture with
asynchronous in one application
Stream processing frameworks
Quix Streams — Kafka Summit 2023 | 22
JAR files…
Quix Streams — Kafka Summit 2023 | 23
Connecting Flink to Kafka is difficult
Quix Streams — Kafka Summit 2023 | 24
SQL looks easy to use but…
Quix Streams — Kafka Summit 2023 | 25
● Poor development experience
○ Logs only accessible from server, no debugging possible
● Performance hit caused by interface between JVM and Python
UDFs are nasty
Quix Streams — Kafka Summit 2023 | 26
DEBUGGING!!! 🐛🐛🐛
Quix Streams — Kafka Summit 2023 | 27
● Combining Kafka API approach with stream processing library
● Abstraction from key-value messages of Kafka API to virtual tables
● Standalone library that runs:
○ Locally for development and debugging
○ In docker or in Kubernetes for production deployments at scale
Is there a third way?
Quix Streams — Kafka Summit 2023 | 28
1. Messages in topic 2. Split messages into individual streams
4. Messages decomposed into rows
5. Memory state
updated from
incoming rows/series
6. State persistence
3. Message converted to tables
7. State and incoming data
is combined to output that
is sent to output topic
Commit offsets
Stateful processing with Pub & Sub client libraries
Quix Streams — Kafka Summit 2023 | 29
Quix Streams
1. Messages in topic 2. Messages decomposed as
rows available via pandas API
3. Messages processed
through pipeline defined as
pandas operations. Output
streamed to output topic.
● Automatic state management
● Automatic checkpointing
● Automatic message serialization/deserialization
Quix Streams — Kafka Summit 2023 | 30
How it works
Kafka + Kubernetes + Python
Quix Streams — Kafka Summit 2023 | 30
Quix Streams — Kafka Summit 2023 | 31
Our approach to stream processing
Containers
Containers running in
Kubernetes scaling hand
to hand with Kafka for
compute scalability.
Kafka
Handle your data reliably
and efficiently in memory
with Kafka. Using Kafka
partitions, replica system and
persistence to deliver
scalability and robustness.
Python
Python gives you flexibility.
It lets you transform data,
not just query it. From simple
filtering to ML use cases like
video processing.
Quix Streams — Kafka Summit 2023 | 32
Processing with streaming
SUB
gForce
X
gForce
Y
gForce
Z
0.5 0.3 0.1
gForce
X
gForce
Y
gForce
Z
gForce
Total
Crash
0.5 0.3 0.1 0.9 1
INPUT TOPIC
APP
OUTPUT TOPIC
PUB
Quix Streams — Kafka Summit 2023 | 33
Scale
SUB
gForce
X
gForce
Y
gForce
Z
0.5 0.3 0.1
gForce
X
gForce
Y
gForce
Z
gForce
Total
Crash
0.5 0.3 0.1 0.9 1
INPUT TOPIC OUTPUT TOPIC
PUB
Quix Streams — Kafka Summit 2023 | 34
Fault tolerant
SUB
gForce
X
gForce
Y
gForce
Z
0.5 0.3 0.1
gForce
X
gForce
Y
gForce
Z
gForce
Total
Crash
0.5 0.3 0.1 0.9 1
INPUT TOPIC OUTPUT TOPIC
PUB
Quix Streams — Kafka Summit 2023 | 35
Let’s build it!
Demo
Quix Streams — Kafka Summit 2023 | 35
Quix Streams — Kafka Summit 2023 | 36
GitHub
Try Quix Streams
Quix Streams — Kafka Summit 2023 | 37
37
info@quix.io | www.quix.io
Thank you
1 de 37

Recomendados

Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S... por
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Anant Corporation
19 visualizações33 slides
Beyond the Brokers: A Tour of the Kafka Ecosystem por
Beyond the Brokers: A Tour of the Kafka EcosystemBeyond the Brokers: A Tour of the Kafka Ecosystem
Beyond the Brokers: A Tour of the Kafka Ecosystemconfluent
780 visualizações77 slides
Beyond the brokers - A tour of the Kafka ecosystem por
Beyond the brokers - A tour of the Kafka ecosystemBeyond the brokers - A tour of the Kafka ecosystem
Beyond the brokers - A tour of the Kafka ecosystemDamien Gasparina
613 visualizações77 slides
Confluent Operator as Cloud-Native Kafka Operator for Kubernetes por
Confluent Operator as Cloud-Native Kafka Operator for KubernetesConfluent Operator as Cloud-Native Kafka Operator for Kubernetes
Confluent Operator as Cloud-Native Kafka Operator for KubernetesKai Wähner
6.1K visualizações36 slides
Beyond the brokers - Un tour de l'écosystème Kafka por
Beyond the brokers - Un tour de l'écosystème KafkaBeyond the brokers - Un tour de l'écosystème Kafka
Beyond the brokers - Un tour de l'écosystème KafkaFlorent Ramiere
783 visualizações77 slides
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramière por
Au delà des brokers, un tour de l’environnement Kafka | Florent RamièreAu delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramièreconfluent
317 visualizações58 slides

Mais conteúdo relacionado

Similar a Case-Study: Building Real-Time Applications at Scale-Cyclist Crash Detection with Tomas Neubauer

Kafka Explainaton por
Kafka ExplainatonKafka Explainaton
Kafka ExplainatonNguyenChiHoangMinh
15 visualizações53 slides
All Streams Ahead! ksqlDB Workshop ANZ por
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZconfluent
109 visualizações46 slides
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak... por
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...HostedbyConfluent
175 visualizações50 slides
Day in the life event-driven workshop por
Day in the life  event-driven workshopDay in the life  event-driven workshop
Day in the life event-driven workshopChristina Lin
217 visualizações97 slides
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka por
Top 5 Event Streaming Use Cases for 2021 with Apache KafkaTop 5 Event Streaming Use Cases for 2021 with Apache Kafka
Top 5 Event Streaming Use Cases for 2021 with Apache KafkaKai Wähner
4.5K visualizações59 slides
The Top 5 Event Streaming Use Cases & Architectures in 2021 por
The Top 5 Event Streaming Use Cases & Architectures in 2021The Top 5 Event Streaming Use Cases & Architectures in 2021
The Top 5 Event Streaming Use Cases & Architectures in 2021confluent
6.7K visualizações59 slides

Similar a Case-Study: Building Real-Time Applications at Scale-Cyclist Crash Detection with Tomas Neubauer(20)

All Streams Ahead! ksqlDB Workshop ANZ por confluent
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZ
confluent109 visualizações
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak... por HostedbyConfluent
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
Learnings From Shipping 1000+ Streaming Data Pipelines To Production with Hak...
HostedbyConfluent175 visualizações
Day in the life event-driven workshop por Christina Lin
Day in the life  event-driven workshopDay in the life  event-driven workshop
Day in the life event-driven workshop
Christina Lin217 visualizações
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka por Kai Wähner
Top 5 Event Streaming Use Cases for 2021 with Apache KafkaTop 5 Event Streaming Use Cases for 2021 with Apache Kafka
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka
Kai Wähner4.5K visualizações
The Top 5 Event Streaming Use Cases & Architectures in 2021 por confluent
The Top 5 Event Streaming Use Cases & Architectures in 2021The Top 5 Event Streaming Use Cases & Architectures in 2021
The Top 5 Event Streaming Use Cases & Architectures in 2021
confluent6.7K visualizações
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d... por Kai Wähner
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Kai Wähner45.1K visualizações
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro... por Kai Wähner
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...
Kai Wähner3.3K visualizações
Conclusion Code Cafe - Microcks for Mocking and Testing Async APIs (January 2... por Lucas Jellema
Conclusion Code Cafe - Microcks for Mocking and Testing Async APIs (January 2...Conclusion Code Cafe - Microcks for Mocking and Testing Async APIs (January 2...
Conclusion Code Cafe - Microcks for Mocking and Testing Async APIs (January 2...
Lucas Jellema416 visualizações
Building Real-time Pipelines with FLaNK_ A Case Study with Transit Data por Timothy Spann
Building Real-time Pipelines with FLaNK_ A Case Study with Transit DataBuilding Real-time Pipelines with FLaNK_ A Case Study with Transit Data
Building Real-time Pipelines with FLaNK_ A Case Study with Transit Data
Timothy Spann193 visualizações
Apache Kafka - Scalable Message Processing and more! por Guido Schmutz
Apache Kafka - Scalable Message Processing and more!Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!
Guido Schmutz704 visualizações
Processing IoT Data from End to End with MQTT and Apache Kafka por confluent
Processing IoT Data from End to End with MQTT and Apache Kafka Processing IoT Data from End to End with MQTT and Apache Kafka
Processing IoT Data from End to End with MQTT and Apache Kafka
confluent11.9K visualizações
Concepts and Patterns for Streaming Services with Kafka por QAware GmbH
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with Kafka
QAware GmbH524 visualizações
Introducing Kafka's Streams API por confluent
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams API
confluent4.9K visualizações
Building streaming data applications using Kafka*[Connect + Core + Streams] b... por Data Con LA
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Data Con LA352 visualizações
Devoxx university - Kafka de haut en bas por Florent Ramiere
Devoxx university - Kafka de haut en basDevoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en bas
Florent Ramiere1.9K visualizações
Building Streaming Data Applications Using Apache Kafka por Slim Baltagi
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
Slim Baltagi7.9K visualizações
Bridge to Cloud: Using Apache Kafka to Migrate to AWS por confluent
Bridge to Cloud: Using Apache Kafka to Migrate to AWSBridge to Cloud: Using Apache Kafka to Migrate to AWS
Bridge to Cloud: Using Apache Kafka to Migrate to AWS
confluent5.4K visualizações
Integrating Apache Kafka Into Your Environment por confluent
Integrating Apache Kafka Into Your EnvironmentIntegrating Apache Kafka Into Your Environment
Integrating Apache Kafka Into Your Environment
confluent3.9K visualizações
Kafka Practices @ Uber - Seattle Apache Kafka meetup por Mingmin Chen
Kafka Practices @ Uber - Seattle Apache Kafka meetupKafka Practices @ Uber - Seattle Apache Kafka meetup
Kafka Practices @ Uber - Seattle Apache Kafka meetup
Mingmin Chen1.6K visualizações

Mais de HostedbyConfluent

Build Real-time Machine Learning Apps on Generative AI with Kafka Streams por
Build Real-time Machine Learning Apps on Generative AI with Kafka StreamsBuild Real-time Machine Learning Apps on Generative AI with Kafka Streams
Build Real-time Machine Learning Apps on Generative AI with Kafka StreamsHostedbyConfluent
77 visualizações26 slides
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ... por
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...HostedbyConfluent
33 visualizações84 slides
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ... por
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...HostedbyConfluent
71 visualizações97 slides
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern... por
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...HostedbyConfluent
64 visualizações15 slides
Rule Based Asset Management Workflow Automation at Netflix por
Rule Based Asset Management Workflow Automation at NetflixRule Based Asset Management Workflow Automation at Netflix
Rule Based Asset Management Workflow Automation at NetflixHostedbyConfluent
40 visualizações56 slides
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML... por
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...HostedbyConfluent
65 visualizações32 slides

Mais de HostedbyConfluent(20)

Build Real-time Machine Learning Apps on Generative AI with Kafka Streams por HostedbyConfluent
Build Real-time Machine Learning Apps on Generative AI with Kafka StreamsBuild Real-time Machine Learning Apps on Generative AI with Kafka Streams
Build Real-time Machine Learning Apps on Generative AI with Kafka Streams
HostedbyConfluent77 visualizações
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ... por HostedbyConfluent
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...
HostedbyConfluent33 visualizações
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ... por HostedbyConfluent
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...
HostedbyConfluent71 visualizações
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern... por HostedbyConfluent
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
HostedbyConfluent64 visualizações
Rule Based Asset Management Workflow Automation at Netflix por HostedbyConfluent
Rule Based Asset Management Workflow Automation at NetflixRule Based Asset Management Workflow Automation at Netflix
Rule Based Asset Management Workflow Automation at Netflix
HostedbyConfluent40 visualizações
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML... por HostedbyConfluent
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...
HostedbyConfluent65 visualizações
Indeed Flex: The Story of a Revolutionary Recruitment Platform por HostedbyConfluent
Indeed Flex: The Story of a Revolutionary Recruitment PlatformIndeed Flex: The Story of a Revolutionary Recruitment Platform
Indeed Flex: The Story of a Revolutionary Recruitment Platform
HostedbyConfluent40 visualizações
Forecasting Kafka Lag Issues with Machine Learning por HostedbyConfluent
Forecasting Kafka Lag Issues with Machine LearningForecasting Kafka Lag Issues with Machine Learning
Forecasting Kafka Lag Issues with Machine Learning
HostedbyConfluent31 visualizações
Getting Under the Hood of Kafka Streams: Optimizing Storage Engines to Tune U... por HostedbyConfluent
Getting Under the Hood of Kafka Streams: Optimizing Storage Engines to Tune U...Getting Under the Hood of Kafka Streams: Optimizing Storage Engines to Tune U...
Getting Under the Hood of Kafka Streams: Optimizing Storage Engines to Tune U...
HostedbyConfluent40 visualizações
Maximizing Real-Time Data Processing with Apache Kafka and InfluxDB: A Compre... por HostedbyConfluent
Maximizing Real-Time Data Processing with Apache Kafka and InfluxDB: A Compre...Maximizing Real-Time Data Processing with Apache Kafka and InfluxDB: A Compre...
Maximizing Real-Time Data Processing with Apache Kafka and InfluxDB: A Compre...
HostedbyConfluent45 visualizações
Accelerating Path to Production for Generative AI-powered Applications por HostedbyConfluent
Accelerating Path to Production for Generative AI-powered ApplicationsAccelerating Path to Production for Generative AI-powered Applications
Accelerating Path to Production for Generative AI-powered Applications
HostedbyConfluent70 visualizações
Optimize Costs and Scale Your Streaming Applications with Virtually Unlimited... por HostedbyConfluent
Optimize Costs and Scale Your Streaming Applications with Virtually Unlimited...Optimize Costs and Scale Your Streaming Applications with Virtually Unlimited...
Optimize Costs and Scale Your Streaming Applications with Virtually Unlimited...
HostedbyConfluent42 visualizações
Don’t Let Degradation Bring You Down: Automatically Detect & Remediate Degrad... por HostedbyConfluent
Don’t Let Degradation Bring You Down: Automatically Detect & Remediate Degrad...Don’t Let Degradation Bring You Down: Automatically Detect & Remediate Degrad...
Don’t Let Degradation Bring You Down: Automatically Detect & Remediate Degrad...
HostedbyConfluent57 visualizações
Streaming is a Detail por HostedbyConfluent
Streaming is a DetailStreaming is a Detail
Streaming is a Detail
HostedbyConfluent39 visualizações
Go Big or Go Home: Approaching Kafka Replication at Scale por HostedbyConfluent
Go Big or Go Home: Approaching Kafka Replication at ScaleGo Big or Go Home: Approaching Kafka Replication at Scale
Go Big or Go Home: Approaching Kafka Replication at Scale
HostedbyConfluent39 visualizações
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2 por HostedbyConfluent
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
HostedbyConfluent37 visualizações
A Trifecta of Real-Time Applications: Apache Kafka, Flink, and Druid por HostedbyConfluent
A Trifecta of Real-Time Applications: Apache Kafka, Flink, and DruidA Trifecta of Real-Time Applications: Apache Kafka, Flink, and Druid
A Trifecta of Real-Time Applications: Apache Kafka, Flink, and Druid
HostedbyConfluent89 visualizações
From Raw Data to an Interactive Data App in an Hour: Powered by Snowpark Python por HostedbyConfluent
From Raw Data to an Interactive Data App in an Hour: Powered by Snowpark PythonFrom Raw Data to an Interactive Data App in an Hour: Powered by Snowpark Python
From Raw Data to an Interactive Data App in an Hour: Powered by Snowpark Python
HostedbyConfluent83 visualizações
Beyond Monoliths: Thrivent’s Lessons in Building a Modern Integration Archite... por HostedbyConfluent
Beyond Monoliths: Thrivent’s Lessons in Building a Modern Integration Archite...Beyond Monoliths: Thrivent’s Lessons in Building a Modern Integration Archite...
Beyond Monoliths: Thrivent’s Lessons in Building a Modern Integration Archite...
HostedbyConfluent57 visualizações
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K... por HostedbyConfluent
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K...Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K...
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K...
HostedbyConfluent74 visualizações

Último

"Surviving highload with Node.js", Andrii Shumada por
"Surviving highload with Node.js", Andrii Shumada "Surviving highload with Node.js", Andrii Shumada
"Surviving highload with Node.js", Andrii Shumada Fwdays
40 visualizações29 slides
Business Analyst Series 2023 - Week 3 Session 5 por
Business Analyst Series 2023 -  Week 3 Session 5Business Analyst Series 2023 -  Week 3 Session 5
Business Analyst Series 2023 - Week 3 Session 5DianaGray10
369 visualizações20 slides
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... por
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...ShapeBlue
74 visualizações18 slides
Microsoft Power Platform.pptx por
Microsoft Power Platform.pptxMicrosoft Power Platform.pptx
Microsoft Power Platform.pptxUni Systems S.M.S.A.
67 visualizações38 slides
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit... por
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...ShapeBlue
57 visualizações25 slides
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue por
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlueShapeBlue
50 visualizações23 slides

Último(20)

"Surviving highload with Node.js", Andrii Shumada por Fwdays
"Surviving highload with Node.js", Andrii Shumada "Surviving highload with Node.js", Andrii Shumada
"Surviving highload with Node.js", Andrii Shumada
Fwdays40 visualizações
Business Analyst Series 2023 - Week 3 Session 5 por DianaGray10
Business Analyst Series 2023 -  Week 3 Session 5Business Analyst Series 2023 -  Week 3 Session 5
Business Analyst Series 2023 - Week 3 Session 5
DianaGray10369 visualizações
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... por ShapeBlue
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
ShapeBlue74 visualizações
Microsoft Power Platform.pptx por Uni Systems S.M.S.A.
Microsoft Power Platform.pptxMicrosoft Power Platform.pptx
Microsoft Power Platform.pptx
Uni Systems S.M.S.A.67 visualizações
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit... por ShapeBlue
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
ShapeBlue57 visualizações
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue por ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
ShapeBlue50 visualizações
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue por ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
ShapeBlue131 visualizações
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ... por ShapeBlue
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
ShapeBlue65 visualizações
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates por ShapeBlue
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesKeynote Talk: Open Source is Not Dead - Charles Schulz - Vates
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates
ShapeBlue119 visualizações
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N... por James Anderson
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...
James Anderson133 visualizações
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f... por TrustArc
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc77 visualizações
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R... por ShapeBlue
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
ShapeBlue54 visualizações
NTGapps NTG LowCode Platform por Mustafa Kuğu
NTGapps NTG LowCode Platform NTGapps NTG LowCode Platform
NTGapps NTG LowCode Platform
Mustafa Kuğu141 visualizações
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue por ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueMigrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
ShapeBlue96 visualizações
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... por ShapeBlue
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
ShapeBlue88 visualizações
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda... por ShapeBlue
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
ShapeBlue63 visualizações
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ... por Jasper Oosterveld
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
Jasper Oosterveld28 visualizações
State of the Union - Rohit Yadav - Apache CloudStack por ShapeBlue
State of the Union - Rohit Yadav - Apache CloudStackState of the Union - Rohit Yadav - Apache CloudStack
State of the Union - Rohit Yadav - Apache CloudStack
ShapeBlue145 visualizações
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT por ShapeBlue
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
ShapeBlue91 visualizações
Scaling Knowledge Graph Architectures with AI por Enterprise Knowledge
Scaling Knowledge Graph Architectures with AIScaling Knowledge Graph Architectures with AI
Scaling Knowledge Graph Architectures with AI
Enterprise Knowledge53 visualizações

Case-Study: Building Real-Time Applications at Scale-Cyclist Crash Detection with Tomas Neubauer

  • 1. Quix Streams — Kafka Summit 2023 | 1 Quix Streams Building Real-Time Applications at Scale
  • 2. Quix Streams — Kafka Summit 2023 | 2 Tomas Neubauer Previously McLaren technical lead CTO & Co-founder, Quix Hello, nice to meet you! 👋
  • 3. Quix Streams — Kafka Summit 2023 | 3 Racing background Roots in real-time data processing in the most extreme, time-critical environment. ● 50,000 channels per car ● 1.5 kHz per channel ● 1,000s realtime models and simulations Quix Streams — Kafka Summit 2023 | 3
  • 4. Quix Streams — Kafka Summit 2023 | 4 Now raise your hand if you are using…
  • 5. Quix Streams — Kafka Summit 2023 | 5 Kafka Now raise your hand if you are using…
  • 6. Quix Streams — Kafka Summit 2023 | 6 Streaming Now raise your hand if you are using…
  • 7. Quix Streams — Kafka Summit 2023 | 7 Python Now raise your hand if you are using…
  • 8. Quix Streams — Kafka Summit 2023 | 8 Goal Crash detection phone-data crashes Fitness app
  • 9. Quix Streams — Kafka Summit 2023 | 9 ● Architecture ● ML deployment ● Streaming landscape ● How it works ● Demo - Let's build it Content
  • 10. Quix Streams — Kafka Summit 2023 | 10 ML Deployment Quix Streams — Kafka Summit 2023 | 10 REST API vs Streaming
  • 11. Quix Streams — Kafka Summit 2023 | 11 phone-data Websocket gateway alerts Websocket gateway ANALYSIS & TRAINING Trained model SageMaker Architecture Crash detection
  • 12. Quix Streams — Kafka Summit 2023 | 12 ML Deployment with API API REQUEST WEB API API RESPONSE gX gY gZ gTotal 0.5 0.3 0.1 0.9 gX gY gZ gTotal Crash 0.5 0.3 0.1 0.9 1 SERVICE
  • 13. Quix Streams — Kafka Summit 2023 | 13 Issues with REST APIs REST API vs Streaming Quix Streams — Kafka Summit 2023 | 13
  • 14. Quix Streams — Kafka Summit 2023 | 14 Problems with REST API API REQUEST gX gY gZ gTotal ● CPU overhead ● Introducing delay ● Requests gets lost in case of service downtime or slow performance WEB API SERVICE
  • 15. Quix Streams — Kafka Summit 2023 | 15 Problems with REST API gX gY gZ gTotal WEB API SERVICE API REQUEST
  • 16. Quix Streams — Kafka Summit 2023 | 16 Problems with REST API gX gY gZ gTotal WEB API SERVICE API REQUEST
  • 17. Quix Streams — Kafka Summit 2023 | 17 Problems with REST API API REQUEST gX gY gZ gTotal API REQUEST gX gY gZ gTotal WEB API SERVICE WEB API SERVICE
  • 18. Quix Streams — Kafka Summit 2023 | 18 Stream processing applications An overview of stream processing approaches Quix Streams — Kafka Summit 2023 | 18
  • 19. Quix Streams — Kafka Summit 2023 | 19 When you building stream processing applications with Kafka, there are two options: 1. Just build an application that uses the Kafka producer and consumer APIs directly 2. Adopt a full-fledged stream processing framework (Flink, Spark streaming, Beam etc.) Stream processing applications
  • 20. Quix Streams — Kafka Summit 2023 | 20 ● Works for simple stuff like one-message-at-a-time processing ● No external dependencies like JVM ● Gets very complicated when stateful processing is needed like calculation aggregations or joining multiple streams Kafka producer and consumer APIs
  • 21. Quix Streams — Kafka Summit 2023 | 21 ● Fully fledged stream processing frameworks solves stateful, more complex operations ● But it is for a cost of increased complexity in many dimensions: ○ Java dependency ○ Deployment gets difficult because code is not running on its own but in server side cluster (Flink cluster or Spark cluster) ○ Debugging is difficult ○ Performance optimization is difficult ○ Gets even worse when we combine synchronous architecture with asynchronous in one application Stream processing frameworks
  • 22. Quix Streams — Kafka Summit 2023 | 22 JAR files…
  • 23. Quix Streams — Kafka Summit 2023 | 23 Connecting Flink to Kafka is difficult
  • 24. Quix Streams — Kafka Summit 2023 | 24 SQL looks easy to use but…
  • 25. Quix Streams — Kafka Summit 2023 | 25 ● Poor development experience ○ Logs only accessible from server, no debugging possible ● Performance hit caused by interface between JVM and Python UDFs are nasty
  • 26. Quix Streams — Kafka Summit 2023 | 26 DEBUGGING!!! 🐛🐛🐛
  • 27. Quix Streams — Kafka Summit 2023 | 27 ● Combining Kafka API approach with stream processing library ● Abstraction from key-value messages of Kafka API to virtual tables ● Standalone library that runs: ○ Locally for development and debugging ○ In docker or in Kubernetes for production deployments at scale Is there a third way?
  • 28. Quix Streams — Kafka Summit 2023 | 28 1. Messages in topic 2. Split messages into individual streams 4. Messages decomposed into rows 5. Memory state updated from incoming rows/series 6. State persistence 3. Message converted to tables 7. State and incoming data is combined to output that is sent to output topic Commit offsets Stateful processing with Pub & Sub client libraries
  • 29. Quix Streams — Kafka Summit 2023 | 29 Quix Streams 1. Messages in topic 2. Messages decomposed as rows available via pandas API 3. Messages processed through pipeline defined as pandas operations. Output streamed to output topic. ● Automatic state management ● Automatic checkpointing ● Automatic message serialization/deserialization
  • 30. Quix Streams — Kafka Summit 2023 | 30 How it works Kafka + Kubernetes + Python Quix Streams — Kafka Summit 2023 | 30
  • 31. Quix Streams — Kafka Summit 2023 | 31 Our approach to stream processing Containers Containers running in Kubernetes scaling hand to hand with Kafka for compute scalability. Kafka Handle your data reliably and efficiently in memory with Kafka. Using Kafka partitions, replica system and persistence to deliver scalability and robustness. Python Python gives you flexibility. It lets you transform data, not just query it. From simple filtering to ML use cases like video processing.
  • 32. Quix Streams — Kafka Summit 2023 | 32 Processing with streaming SUB gForce X gForce Y gForce Z 0.5 0.3 0.1 gForce X gForce Y gForce Z gForce Total Crash 0.5 0.3 0.1 0.9 1 INPUT TOPIC APP OUTPUT TOPIC PUB
  • 33. Quix Streams — Kafka Summit 2023 | 33 Scale SUB gForce X gForce Y gForce Z 0.5 0.3 0.1 gForce X gForce Y gForce Z gForce Total Crash 0.5 0.3 0.1 0.9 1 INPUT TOPIC OUTPUT TOPIC PUB
  • 34. Quix Streams — Kafka Summit 2023 | 34 Fault tolerant SUB gForce X gForce Y gForce Z 0.5 0.3 0.1 gForce X gForce Y gForce Z gForce Total Crash 0.5 0.3 0.1 0.9 1 INPUT TOPIC OUTPUT TOPIC PUB
  • 35. Quix Streams — Kafka Summit 2023 | 35 Let’s build it! Demo Quix Streams — Kafka Summit 2023 | 35
  • 36. Quix Streams — Kafka Summit 2023 | 36 GitHub Try Quix Streams
  • 37. Quix Streams — Kafka Summit 2023 | 37 37 info@quix.io | www.quix.io Thank you