© 2022 Bloomberg Finance L.P. All rights reserved.
Dynamic Rule-based
Real-time Market Data Alerts
Flink Forward San Francisco 2022
August 3, 2022
Madhuri Jain, Software Engineer
Ajay Vyasapeetam, Team Lead
Outline
• Use case
• Architecture
• Implementation Details
• Failure and Recovery
• Lessons Learnt
Market Data Alerts
• What are these market data alerts?
▪ Market movements
▪ Quality and accuracy of data
▪ Other notable events: trades, trading halts, changes in volatility, etc.
• Who cares about these market data alerts?
▪ Engineers / QC
▪ Bloomberg clients
• How do they want to be alerted?
▪ Email
▪ Tickets / Pager
▪ Messaging queue (Apache Kafka)
Alert System Concepts
TradeEvent(
  ticker: 'AAPL',
  price: 100,
  currency: Dollars,
  timestamp: 1654818057
)

Rule(
  id: 'UniqueRandomRuleId',
  sql: '.......',
  destination: 'e-mail'
)

from inputStream
select ticker, price, timestamp
having ticker == 'AAPL' and price < 100
insert into outputStream
• Events - what to alert
• Rules - when to alert
• Destinations - where to alert
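These three concepts can be sketched in plain Java (class and field names mirror the slide but are otherwise illustrative, not Bloomberg's actual classes):

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative event type: what to alert on.
class TradeEvent {
    final String ticker; final double price; final long timestamp;
    TradeEvent(String ticker, double price, long timestamp) {
        this.ticker = ticker; this.price = price; this.timestamp = timestamp;
    }
}

// Illustrative rule type: when to alert, and where to send it.
class Rule {
    final String id;                       // unique rule id
    final Predicate<TradeEvent> condition; // when to alert
    final String destination;              // where to alert (e-mail, Kafka, ticket)
    Rule(String id, Predicate<TradeEvent> condition, String destination) {
        this.id = id; this.condition = condition; this.destination = destination;
    }
}

public class AlertConcepts {
    // For one event, return "ruleId -> destination" for every matching rule.
    static List<String> evaluate(TradeEvent event, List<Rule> rules) {
        return rules.stream()
                .filter(r -> r.condition.test(event))
                .map(r -> r.id + " -> " + r.destination)
                .collect(Collectors.toList());
    }
}
```

For example, a rule `e -> e.ticker.equals("AAPL") && e.price < 100` routed to `e-mail` fires on a 95-dollar AAPL trade but not on a 150-dollar IBM trade.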
Value alerts
[Figure: price vs. time; a negative price triggers an alert]
[Figure: price vs. time; a delayed update triggers a latency alert]
Gap Detection
[Figure: price vs. time; a pause in incoming ticks triggers a gap alert]
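Conceptually, a gap alert fires when no new tick arrives for a ticker within a fixed window. A minimal plain-Java sketch of that idea (illustrative only; the talk later expresses this as a Siddhi absence pattern, and event-time handling is simplified here to in-order timestamps):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual gap detector: alerts when consecutive events for the same
// ticker are more than `gapSeconds` apart.
public class GapDetector {
    private final long gapSeconds;
    private final Map<String, Long> lastSeen = new HashMap<>();

    public GapDetector(long gapSeconds) { this.gapSeconds = gapSeconds; }

    // Feed events in timestamp order; returns any alerts this event triggers.
    public List<String> onEvent(String ticker, long timestamp) {
        List<String> alerts = new ArrayList<>();
        Long prev = lastSeen.put(ticker, timestamp); // remember latest tick
        if (prev != null && timestamp - prev > gapSeconds) {
            alerts.add(ticker + ": Gap of " + (timestamp - prev) + "s");
        }
        return alerts;
    }
}
```

A real streaming implementation would fire on a timer when the window expires rather than waiting for the next event; this sketch only illustrates the condition being detected.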
Spike Detection
[Figure: price vs. time; a sudden price spike triggers a spike alert]
Architecture
[Diagram: the user requests alerts by sending rule parameters and an input source through the Rules UI; the Rules Manager persists rules in a database and forwards them to the Flink job, which processes events and sends out alerts]
Why Apache Flink?
• Allows stateful computations over unbounded datastreams
• Support for multiple sources and sinks
▪ Kafka, other message queues, Cassandra
• Broadcast State Pattern
• Robust Fault Tolerance
• Scalability
• Natural fit to build complex event processing layer
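The Broadcast State Pattern mentioned above is the key piece: rule updates are broadcast to every parallel instance, while events are partitioned. A plain-Java simulation of that idea (this is not the Flink broadcast API itself; all names here are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Simulation of the broadcast state pattern: every parallel worker holds the
// same rule map (broadcast state), while events are partitioned by key.
public class BroadcastSketch {
    static class Worker {
        final Map<String, Predicate<Double>> rules = new HashMap<>(); // broadcast state
        void onRule(String ruleId, Predicate<Double> condition) { rules.put(ruleId, condition); }
        List<String> onEvent(double price) {
            List<String> alerts = new ArrayList<>();
            rules.forEach((id, cond) -> { if (cond.test(price)) alerts.add(id); });
            return alerts;
        }
    }

    final List<Worker> workers = new ArrayList<>();
    BroadcastSketch(int parallelism) {
        for (int i = 0; i < parallelism; i++) workers.add(new Worker());
    }

    // A rule update is broadcast: every worker receives it.
    void broadcastRule(String ruleId, Predicate<Double> condition) {
        workers.forEach(w -> w.onRule(ruleId, condition));
    }

    // An event is partitioned: exactly one worker (here, chosen by hash) sees it.
    List<String> processEvent(String ticker, double price) {
        Worker w = workers.get(Math.abs(ticker.hashCode()) % workers.size());
        return w.onEvent(price);
    }
}
```

Because every worker holds every rule, it does not matter which partition an event lands on; that is exactly the property the rule stream needs.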
[Diagram: the event stream and the broadcast rule stream (rules such as ruleId 12345, ticker ABCD1, price 50) both feed the rule engine inside the Flink job, which emits (alert, destination) pairs (e-mail, Kafka, ticket) to the alert sink]
Flink DSL

deriv_input_list.stream()
    .filter(x -> x.getPrice() >= 0.5)
    .map(x -> getAlertFields(x));

Flink CEP

DataStream<Event> input = ...;
Pattern<Event, ?> pattern = Pattern.<Event>begin("start").where(
    new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event event) {
            return event.getPrice() >= 0.5;
        }
    }
);
PatternStream<Event> patternStream = CEP.pattern(input, pattern);
How can we represent rules?

Code-based Rules
• Benefits
  ▪ Supported by Flink
  ▪ Supports complex rules
  ▪ Increased flexibility
• Cons
  ▪ Code changes for every user request
  ▪ Redeployment
  ▪ Increased code complexity
SQL-like Rules sent as events?
● Benefits
■ Easy to Reconstruct
■ No re-deployment
■ Supports complex queries
■ Support for multiple users
■ Reduced code complexity
● Cons
■ Learning curve
■ Debugging
SQL-like?

FROM inputStream
SELECT ticker, price
HAVING price >= 0.5
INSERT INTO alertStream

Rule(
  id: 'UniqueRandomRuleId',
  sql: '.......',
  operation: CREATE
)

Rule(
  id: 'UniqueRandomRuleId',
  operation: DISABLE
)

Rule(
  id: 'UniqueRandomRuleId',
  operation: ENABLE
)

Siddhi: SQL-like rules sent as events!
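The CREATE / ENABLE / DISABLE operations carried on the rule stream can be handled by a small registry keyed on rule id. A sketch under illustrative names (the class and its fields are not the actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual registry applying rule operations received as events.
public class RuleRegistry {
    enum Operation { CREATE, ENABLE, DISABLE, DELETE }

    static class RuleState {
        final String sql;       // the rule's SQL-like query text
        boolean enabled = true; // DISABLE keeps the rule but stops matching
        RuleState(String sql) { this.sql = sql; }
    }

    private final Map<String, RuleState> rules = new HashMap<>();

    // Apply one operation from the rule stream; `sql` is only used for CREATE.
    void apply(String ruleId, Operation op, String sql) {
        switch (op) {
            case CREATE:  rules.put(ruleId, new RuleState(sql)); break;
            case ENABLE:  rules.get(ruleId).enabled = true; break;
            case DISABLE: rules.get(ruleId).enabled = false; break;
            case DELETE:  rules.remove(ruleId); break;
        }
    }

    boolean isActive(String ruleId) {
        RuleState s = rules.get(ruleId);
        return s != null && s.enabled;
    }
}
```

The point of DISABLE as a separate operation is that the rule (and any window state it owns) survives and can be re-enabled without resending the query.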
Rule Engine - Siddhi
• Stream processing and complex event processing platform
• Provides Siddhi SQL
• Extension Framework
from inputStream
select ticker, price, timestamp
having ticker == 'IBM' and price > 40
insert into outputStream;

from inputStream[outTimestamp - msgTimestamp > 60]
select ticker, (outTimestamp - msgTimestamp) as latency
insert into outputStream;

from every i1=inputStream[ticker == 'IBM']
    -> not inputStream[ticker == 'IBM'] for 20 sec
select 'IBM' as ticker, 'Gap' as alert
insert into outputStream;
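The latency query above can be mirrored in plain Java to make the condition concrete (illustrative only; the production system evaluates this as Siddhi SQL, and the 60-second threshold is taken from the query):

```java
import java.util.Optional;

// Plain-Java equivalent of the latency rule: emit ticker and latency when
// processing time minus message time exceeds a threshold.
public class LatencyCheck {
    static final long THRESHOLD_SECONDS = 60;

    // Returns "ticker: <latency>s" when the event breaches the threshold.
    static Optional<String> check(String ticker, long msgTimestamp, long outTimestamp) {
        long latency = outTimestamp - msgTimestamp;
        return latency > THRESHOLD_SECONDS
                ? Optional.of(ticker + ": " + latency + "s")
                : Optional.empty();
    }
}
```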
Flink-Siddhi
• Library to easily run Siddhi CEP within Flink streaming application
• Allows dynamic operations on rules
— Create / Update / Delete
— Enable / Disable
• Easy integration of Flink DataStream API with Siddhi CEP APIs
• Integration of Siddhi runtime state management with Flink state
Siddhi Operator
[Diagram: initial steps: (1) initialize the Flink topology, (2) register the CEP extension. At runtime the Siddhi operator (1) deserializes incoming messages (rule operations such as Create: Rule1 on the rule stream; events such as ruleId 12345, ticker ABCD1, price 50), (2) manages Siddhi state, (3) matches rules with Siddhi CEP, and (4) emits (alertMap, destination) pairs (e-mail, Kafka, ticket) to the alert sink]
[Diagram: a CREATE rule arrives on the broadcast rule stream and is installed; each matching event resets the gap-rule window, and when no matching event arrives within 20 seconds a gap alert is emitted to the alert stream]

RuleId: ABCD1234
Operation: CREATE
SiddhiSQL:
from every i1=inputStream[ticker == 'IBM']
    -> not inputStream[ticker == 'IBM'] for 20 sec
select 'IBM' as ticker, 'Gap' as alert
insert into outputStream;
Destination: Kafka
Checkpointing and Savepointing
[Diagram: a Flink-Siddhi job snapshot contains the job metadata, the Kafka offsets, and the state of each Siddhi runtime (e.g., SR1: EP1, SR2: EP1, SR3: EP1)]
Lessons Learnt
● Flatten input data to leverage the simplicity of Siddhi SQL
● In-house feature development on the open-source Flink-Siddhi library
  ■ Flink and Siddhi version upgrades
  ■ Checkpointing/savepointing for the Siddhi runtime
● Understand the limitations of SQL-like queries and be ready to extend them
● Be aware of performance bottlenecks and explore the trade-offs that can be made
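Flattening input data can be sketched as a recursive map walk that joins nested keys with a separator, so Siddhi SQL can address fields by simple names (the field names and the underscore separator here are hypothetical; the real schema is internal):

```java
import java.util.HashMap;
import java.util.Map;

// Flatten a nested event map so "quote" -> {"price": 40.0} becomes "quote_price" -> 40.0.
public class Flattener {
    static Map<String, Object> flatten(String prefix, Map<String, Object> nested) {
        Map<String, Object> flat = new HashMap<>();
        for (Map.Entry<String, Object> e : nested.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "_" + e.getKey();
            if (e.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) e.getValue();
                flat.putAll(flatten(key, child)); // recurse into nested structures
            } else {
                flat.put(key, e.getValue());
            }
        }
        return flat;
    }
}
```

With flat keys, a rule can simply write `quote_price > 40` instead of needing nested-object support in the query language.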
Thank You!
Questions?
https://TechAtBloomberg.com
https://www.bloomberg.com/careers
Contact us:
Madhuri Jain (mjain189@bloomberg.net)
Ajay Vyasapeetam (avyasapeeta1@bloomberg.net)
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 

Último

unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 

Último (20)

unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 

Dynamic Rule-based Real-time Market Data Alerts

  • 1. © 2022 Bloomberg Finance L.P. All rights reserved. Dynamic Rule-based Real-time Market Data Alerts Flink Forward San Francisco 2022 August 3, 2022 Madhuri Jain, Software Engineer Ajay Vyasapeetam, Team Lead
  • 2. © 2018 Bloomberg Finance L.P. All rights reserved. © 2022 Bloomberg Finance L.P. All rights reserved.
  • 3. © 2022 Bloomberg Finance L.P. All rights reserved.
       Outline
       • Use case
       • Architecture
       • Implementation Details
       • Failure and Recovery
       • Lessons Learnt
  • 4. © 2022 Bloomberg Finance L.P. All rights reserved.
       Market Data Alerts
       • What are these market data alerts?
         ▪ Market movements
         ▪ Quality and accuracy of data
         ▪ Other notable events: trades, trading halts, changes in volatility, etc.
       • Who cares about these market data alerts?
         ▪ Engineers / QC
         ▪ Bloomberg clients
       • How do they want to be alerted?
         ▪ Email
         ▪ Tickets / Pager
         ▪ Messaging queue (Apache Kafka)
  • 5. © 2022 Bloomberg Finance L.P. All rights reserved.
       Alert System Concepts
       • Events - what to alert
       • Rules - when to alert
       • Destinations - where to alert

       TradeEvent(
         ticker: "AAPL",
         price: 100,
         currency: Dollars,
         timestamp: 1654818057,
       )

       Rule(
         id: 'UniqueRandomRuleId',
         Sql: '.......',
         Destination: 'e-mail'
       )

       From inputStream
       select ticker, price, timestamp
       having ticker = "AAPL" and price < 100
       insert into outputStream
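The three concepts on this slide can be sketched as plain data structures. This is a hypothetical Python rendering for illustration only; the field names simply mirror the slide's TradeEvent/Rule examples, and the SQL string is an assumed stand-in for a Siddhi-style condition.

```python
from dataclasses import dataclass

# Illustrative sketch of the deck's three core concepts:
# events (what to alert), rules (when to alert), destinations (where to alert).

@dataclass
class TradeEvent:
    ticker: str
    price: float
    currency: str
    timestamp: int  # epoch seconds, as in the slide's example

@dataclass
class Rule:
    id: str
    sql: str          # a SQL-like condition, e.g. Siddhi SQL
    destination: str  # e.g. "e-mail", a Kafka topic, a ticket queue

event = TradeEvent("AAPL", 99.5, "USD", 1654818057)
rule = Rule(
    "UniqueRandomRuleId",
    "from inputStream select ticker, price having ticker == 'AAPL' and price < 100",
    "e-mail",
)
```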
  • 6. © 2022 Bloomberg Finance L.P. All rights reserved.
       Value alerts
       [Charts: price vs. time, illustrating a Negative Price Alert and a Latency Alert]
  • 7. © 2022 Bloomberg Finance L.P. All rights reserved.
       Gap Detection
       [Chart: price vs. time, illustrating a Gap Alert]
  • 8. © 2022 Bloomberg Finance L.P. All rights reserved.
       Spike Detection
       [Chart: price vs. time, illustrating a Spike Alert]
  • 9. © 2022 Bloomberg Finance L.P. All rights reserved.
       Architecture
       [Diagram: a User submits Requests through the Rules UI to the Rules Manager,
       which stores them in a Database and sends rule parameters and the input
       source to the Flink Job; the Flink Job consumes Events and emits Alerts]
  • 10. © 2022 Bloomberg Finance L.P. All rights reserved.
        Why Apache Flink?
        • Allows stateful computations over unbounded datastreams
        • Support for multiple sources and sinks
          ▪ Kafka, other message queues, Cassandra
        • Broadcast State Pattern
        • Robust fault tolerance
        • Scalability
        • Natural fit to build a complex event processing layer
  • 11. © 2022 Bloomberg Finance L.P. All rights reserved.
        [Diagram of the Flink job: the Event Stream and the Rule Stream (rules keyed
        by ruleId, e.g. ruleId 12345 / Ticker ABCD1 / Price 50, ruleId 56789 /
        Ticker XYZ5 / Price 10) enter the job; rules are broadcast to the Rule Engine,
        which matches them against events and sends (alert, destination) pairs -
        e-mail, Kafka, ticket - to the Alert Sink]
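The dataflow on this slide - every broadcast rule applied to every event, with matches routed by ruleId to that rule's destination - can be simulated in a few lines. This is a sketch under assumptions: the Siddhi SQL conditions are stood in for by plain Python predicates, and the rule IDs and prices are taken from the slide's example.

```python
# Minimal simulation of the job's dataflow: each event is checked against all
# broadcast rules, and each match is routed to that rule's destination.

rules = {
    "12345": {"match": lambda e: e["ticker"] == "ABCD1" and e["price"] < 50,
              "destination": "e-mail"},
    "56789": {"match": lambda e: e["ticker"] == "XYZ5" and e["price"] < 10,
              "destination": "kafka"},
}

def process(event, rules):
    """Return (alert, destination) pairs for every rule the event matches."""
    return [({"ruleId": rid, **event}, r["destination"])
            for rid, r in rules.items() if r["match"](event)]

alerts = process({"ticker": "ABCD1", "price": 42}, rules)
```

In the real job, the rules live in Flink broadcast state on every parallel instance of the operator, so each event can be evaluated without a shuffle.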
  • 12. How can we represent rules?
        Code-based Rules
        • Benefits
          ▪ Supported by Flink
          ▪ Supports complex rules
          ▪ Increased flexibility
        • Cons
          ▪ Code changes for every user request
          ▪ Redeployment
          ▪ Increased code complexity

        Flink DSL:
          deriv_input_list.stream()
              .filter(x -> x.getPrice() >= 0.5)
              .map(x -> getAlertFields(x))

        Flink CEP:
          DataStream<Event> input = ...;
          Pattern<Event, ?> pattern = Pattern.<Event>begin("start").where(
              new SimpleCondition<Event>() {
                  @Override
                  public boolean filter(Event event) {
                      return event.getPrice() >= 0.5;
                  }
              }
          );
          PatternStream<Event> patternStream = CEP.pattern(input, pattern);
  • 13. © 2022 Bloomberg Finance L.P. All rights reserved.
        SQL-like rules sent as events?
        • Benefits
          ▪ Easy to reconstruct
          ▪ No redeployment
          ▪ Supports complex queries
          ▪ Support for multiple users
          ▪ Reduced code complexity
        • Cons
          ▪ Learning curve
          ▪ Debugging

        SQL-like:
          FROM inputStream
          SELECT ticker, price
          HAVING price >= 0.5
          INSERT INTO alertStream

        Rule(id: 'UniqueRandomRuleId', Sql: '.......', Operation: CREATE)
        Rule(id: 'UniqueRandomRuleId', Operation: DISABLE)
        Rule(id: 'UniqueRandomRuleId', Operation: ENABLE)

        Siddhi: SQL-like rules sent as events!
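The rule events on this slide carry an Operation (CREATE / ENABLE / DISABLE, and per the later slides also UPDATE / DELETE) that is folded into the job's rule state without any redeployment. A minimal sketch of that lifecycle, with assumed dict-shaped rule events rather than the real serialized format:

```python
# Sketch of the dynamic rule lifecycle: each rule event mutates the
# in-memory rule state keyed by rule id. Field names are illustrative.

def apply_rule_event(state, rule_event):
    op, rid = rule_event["operation"], rule_event["id"]
    if op == "CREATE":
        state[rid] = {"sql": rule_event["sql"], "enabled": True}
    elif op == "DISABLE":
        state[rid]["enabled"] = False
    elif op == "ENABLE":
        state[rid]["enabled"] = True
    elif op == "DELETE":
        state.pop(rid, None)
    return state

state = {}
apply_rule_event(state, {"operation": "CREATE", "id": "r1", "sql": "FROM inputStream ..."})
apply_rule_event(state, {"operation": "DISABLE", "id": "r1"})
```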
  • 14. © 2022 Bloomberg Finance L.P. All rights reserved.
        Rule Engine - Siddhi
        • Stream processing and complex event processing platform
        • Provides Siddhi SQL
        • Extension framework

        Value alert:
          from inputStream
          select ticker, price, timestamp
          having ticker = "IBM" and price > 40
          insert into outputStream

        Latency alert:
          from inputStream[outTimestamp - msgTimestamp > 60]
          select ticker, (outTimestamp - msgTimestamp) as latency
          insert into outputStream

        Gap alert:
          from every i1=inputStream[ticker = "IBM"] -> not inputStream[ticker = "IBM"] for 20 sec
          select "IBM" as ticker, "Gap" as alert
          insert into outputStream
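As a concrete reading of the latency query above, the rule selects events whose publish delay exceeds a threshold (60 time units in the slide) and emits (ticker, latency) alerts. A hedged Python equivalent over a batch of events, with assumed field names matching the query:

```python
# Batch rendering of the slide's Siddhi latency rule:
# inputStream[outTimestamp - msgTimestamp > 60] -> (ticker, latency) alerts.

def latency_alerts(events, threshold=60):
    return [
        {"ticker": e["ticker"], "latency": e["outTimestamp"] - e["msgTimestamp"]}
        for e in events
        if e["outTimestamp"] - e["msgTimestamp"] > threshold
    ]

out = latency_alerts([
    {"ticker": "IBM", "msgTimestamp": 100, "outTimestamp": 130},  # delay 30: ok
    {"ticker": "IBM", "msgTimestamp": 100, "outTimestamp": 200},  # delay 100: alert
])
```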
  • 15. © 2022 Bloomberg Finance L.P. All rights reserved.
        Flink-Siddhi
        • Library to easily run Siddhi CEP within a Flink streaming application
        • Allows dynamic operations on rules
          ▪ Create / Update / Delete
          ▪ Enable / Disable
        • Easy integration of the Flink DataStream API with Siddhi CEP APIs
        • Integration of Siddhi runtime state management with Flink state
  • 16. © 2022 Bloomberg Finance L.P. All rights reserved.
        [Diagram of the integrated job. Initial steps: (1) initialize the Flink
        topology, (2) register CEP extensions. The Rule Stream (e.g. "Create: Rule1")
        and the input events (ruleId 12345 / Ticker ABCD1 / Price 50; ruleId 56789 /
        Ticker XYZ5 / Price 10) are deserialized and flow into the Siddhi Operator,
        which manages Siddhi state and matches rules in Siddhi CEP; matched
        (alertMap, destination) pairs - e-mail, Kafka, ticket - go to the Alert Sink]
  • 17. © 2022 Bloomberg Finance L.P. All rights reserved.
        [Diagram: a gap-detection rule event arrives on the Rule Stream -
        RuleId: ABCD1234, Operation: CREATE, Destination: Kafka,
        SiddhiSQL: 'From every i1=inputStream[ticker='IBM'] ->
        not inputStream[ticker='IBM'] For 20 sec
        Select 'IBM' as ticker, 'Gap' as alert Insert into outputStream;'
        Each incoming event starts a gap-rule window; the window resets on the next
        event, and an alert is emitted to the Alert Stream via the Alert Sink when a
        window expires without one]
  • 18. © 2022 Bloomberg Finance L.P. All rights reserved.
        Checkpointing and Savepointing
        [Diagram: a Flink-Siddhi checkpoint of the Flink job captures the job
        metadata, the consumed Kafka offsets, and the Siddhi runtime state
        (e.g. SR1: EP1, SR2: EP1, SR3: EP1)]
  • 19. © 2022 Bloomberg Finance L.P. All rights reserved.
        Lessons Learnt
        • Flattening input data to leverage the simplicity of Siddhi SQL
        • In-house feature development of the open-source Flink-Siddhi library
          ▪ Flink and Siddhi version upgrades
          ▪ Checkpointing/savepointing for the Siddhi runtime
        • Understand the limitations of SQL-like queries and be ready to extend them
        • Be aware of performance bottlenecks and explore the trade-offs that can be made
  • 20. © 2022 Bloomberg Finance L.P. All rights reserved. Thank You! Questions? https://TechAtBloomberg.com https://www.bloomberg.com/careers Contact us: Madhuri Jain (mjain189@bloomberg.net) Ajay Vyasapeetam (avyasapeeta1@bloomberg.net)

Editor's Notes

  1. Hello everyone and welcome to our presentation. I'm Ajay and along with my colleague Madhuri, we'll talk about our Dynamic Rule-Based Real-time Alert System.
  2. Before we dive in, a brief introduction about Bloomberg. We're a financial, media and tech company that provides data and analytics to clients across the globe. We deal with hundreds of billions of events a day, ranging from news stories to stock trades. Our systems have to be highly scalable, very reliable, and low latency to satisfy our clients’ needs.
  3. In this presentation, I'll talk about our use case and Madhuri will dive into the architecture and technology and wrap it up with lessons learnt.
  4. What are Market Data Alerts? They can be market movements (e.g., AAPL shares are down) or measurements of the quality and accuracy of our data (e.g., we haven't published data to our real-time stream in the last hour). They could also be alerts on notable events, like a new stock starting to trade today. Who is interested in these alerts? First are the developers and QC teams, who want to find out about issues with our systems. Then there are our external clients, who want to be notified of market events so they can make timely investment decisions. How are these alerts delivered? They can take the form of emails or pages, and even messages on queues like Apache Kafka, so external systems can react to these alerts.
  5. Before we dive deeper, I want to introduce some terminology we use within our system. Events are data on which we want to alert. For example, this could be things like trade events coming from a stock exchange. Rules define when we want to alert. For example, we might want to be alerted if AAPL stock drops below $100. These rules can be added, deleted and updated dynamically by users. Destinations define where these alerts should be sent. For example, this could be an email address or a Kafka topic.
  6. Now, let me talk about the different kinds of rules we currently support. The first type is value-based alerts. These could be rules like "my time series has negative values" or "there is high latency in the data being published." These rules don't hold any state and only apply the rule to the current event.
  7. The second type of rule is Gap detection. With this rule, we get alerted if we haven't been sending events within a given time window. This can be useful to find any issues in our system.
  8. The other rule we support is Spike detection. This can help us QC and generate alerts on the quality of our data. This rule utilizes state to know what the previous events were in the window to detect alerts. Now that we've covered these use cases, I'll hand it over to Madhuri to talk about our architecture.
  9. Let’s dive into the high-level architecture of our alerting system and look at each component. Rules UI: as described in the previous slide, the user inputs rules, destinations, and the input data source through the alerting UI. These user requests are then forwarded to our Rules Manager. Rules Manager: a service which translates user requests into events/messages that will be consumed by our Flink job. We also store these user requests in our database for compliance purposes. Flink Job: this is the heart of our alerting system. It consumes the rules and events and processes them as per the alerting logic. It then directs the output to the user-provided destination (in this diagram, an Apache Kafka topic).
  10. Why did we choose Apache Flink? Flink provides stateful computations on unbounded datastreams. It also provides local state, like MapState and ListState, for quicker access to data. We started our application with Kafka as both source and destination; however, we want to extend it in the future to support other input and output sources as well. You can broadcast the rules to all the parallel instances of an operator, which maintain these rules in state; each event can then be processed in parallel against these rules. Flink also provides robust fault tolerance through checkpointing and savepointing, which can help us recover from downtime. Flink jobs scale well - for example, you can set your own parallelism for each operator. For all of the above reasons, Flink seemed like a natural fit for the CEP layer.
  11. Let’s zoom into the pieces of our Flink Job We receive our input events and rules through Kafka. Each rule has a rule id associated with it. The rules are then broadcast to all the parallel instances of an operator which then maintains it as state. The rule engine then applies each rule to every single event received and sends the output events accordingly to a datastream. Each ruleId has a destination associated with it. Alert Sink takes the alert messages and sends the alert to its respective destination based on ruleId.
  12. Let’s look at some of the ways we considered for representing our rules. Say I want to be alerted when an event has a price >= 0.5. The easiest way to do that would be to stream the data, filter it based on the alerting condition, and map it to the fields needed for the alert. This is also easier to debug, since the developer has full control over the rules. **CLICK** We could also use Flink CEP, a library implemented on top of Flink, for some of our more complex rules like gap detection. It also allows more flexibility in how you process your events - based on either event time or processing time. However, this would make it harder to cater to all of our users' needs. For example, what if a user wants to change the condition so they're alerted on price > 0.8? That would require code changes, and every code change requires a redeployment, which is not a very friendly approach. If the fields in the input data change, our system has to be made aware of it. And what if the user wants to disable a rule for a certain period of time, but not delete it? These approaches didn’t seem like a good fit, so we continued to look for better ways to represent rules to support our needs.
  13. We continued to look into different rule engines, wanting something that would require little to no code changes in our Flink job. What if we could represent rules as SQL-like queries? The rules could then be sent through a Kafka topic to our Flink job. This is not to be confused with Flink SQL! The benefits we could get from SQL-like rules were tremendous compared to our previous options: The rules would be more readable and easier to reconstruct later on. Since the rules are sent as events, we would not need any code changes, and therefore no redeployments based on user rules. This would also be a great choice if we could represent complex stateful queries like gap detection and spike detection. It would not only help our system, but also open it up to other users (not just developers). It would also give users increased flexibility to disable, enable, delete, or update rules as their requirements change. **CLICK**, **CLICK**, **CLICK** All in all, this fit our use case well, so we decided to move ahead with a SQL-like rule engine called Siddhi. **CLICK**
  14. Siddhi is an open source streaming and complex event processing engine that we decided to use in our project. This helps us make real-time decisions based on predefined rules. **CLICK** It lets you perform SQL-like queries on streaming data This fits our use case completely because we want to be able to easily create and delete rules. Let’s look at some of the Siddhi SQL for the use cases that we discussed earlier: **CLICK** A Value Alert rule would look like this - where you select the fields in your input data that satisfy a certain condition. In this case, it is ticker and price value and you send the output to another dataStream. **CLICK** A latency rule would look like filtering out events that have exceeded a certain time threshold; in this case, 60 ms. **CLICK** Lastly, let’s take a look at one of our complex queries. A Siddhi SQL gap detection query would say: For every event you receive in the input stream, if you do not get an input event for 20 seconds, send an alert to the output stream. Siddhi also supports adding our own custom extensions, which means you can add your own functions, libraries, and custom logic, which are not provided by Siddhi out of the box. If your rule requires a complex logic like sorting of data in a given time window and then performing certain computations on it, you could write that as your own function and plug that in using the Siddhi extension.
  15. How do we integrate Siddhi in our system? We use Flink-Siddhi!! It is a library to run Siddhi CEP with Flink streaming applications It integrates Siddhi CEP as a stream operator and lets you perform operations like filter, aggregation, window, group by, etc. It is through Flink-Siddhi that we manage various operations on our rules like: Creation, Updates, Deletion Enabling and Disabling You can connect single or double Flink datastreams with Siddhi CEP It is Flink-Siddhi that broadcasts the rules that we saw in the previous slides. It performs the necessary operations before sending it to Siddhi. An important feature handled by this library is the integration of Siddhi’s Runtime State Management with the Flink state. It lets Siddhi CEP understand native type information of Flink datastreams - be it the input or output datastream by registering these data types with Siddhi’s Stream schema.
  16. With the knowledge that we have now, here is what our system looks like when Siddhi, Flink-Siddhi, Flink and input events are all put together. Let’s follow the color scheme here: All the pieces related to Flink are in purple, Flink-Siddhi in green, Siddhi in blue, and input/output events in orange Initially, when we start our Flink job, we initialize our topology, get the Siddhi execution environment where the matching of rules will be performed And we also register the necessary primitive extensions for Siddhi to be able to understand our data types. As you can see, the ruleStream has different kinds of rule events - Creating rule1, Updating it These data streams are then registered with Siddhi using the SiddhiOperator in the Flink-Siddhi library And this operator performs all the functions that we spoke about in the previous slide. We then progress into applying rules on the input events. This happens through the Flink-Siddhi library. It performs the necessary transformations on the data for Siddhi to understand it. It is here where it broadcasts the ruleDataStream and knows which outputDataStream the result should be sent to. It also manages the Siddhi state which is used by siddhiRuntimes to perform event processing. This is also managed during failover and recovery. The rest of the process remains the same, where you get the alerts in a dataStream and redirect it to the correct destination using AlertSink.
  17. Let’s look at an example of a gap detection alert and see how it would look in our system. This is the rule event coming in, which has a ruleId, the SQL, and the destination. When the first event enters the system, a window is started for a period of 20 seconds. Since the Siddhi runtime does not see another event in that time window, it triggers an alert. When the second event enters, a new window is started, but no alert is triggered, since the third event arrives before the window expires. When the third event enters, a new window is started, which again triggers an alert. This pattern continues, detecting late and missing events. Now that we know what our alerting system looks like, let’s look at the failover and recovery mechanism of our Flink job.
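The windowing behavior described above can be simulated offline: a gap alert fires whenever the time between two consecutive events exceeds the window length. This is a simplified, single-ticker sketch (the ticker name and 20-second window follow the slide's example); the real job evaluates this continuously in the Siddhi runtime rather than over a finished list.

```python
# Offline simulation of the gap-detection rule: each event opens a window,
# and an alert fires if the next event arrives more than `window` seconds later.

def gap_alerts(timestamps, window=20):
    alerts = []
    for prev, nxt in zip(timestamps, timestamps[1:]):
        if nxt - prev > window:
            alerts.append({"ticker": "IBM", "alert": "Gap", "after": prev})
    return alerts

# events at t=0 (gap before t=25), t=25, t=30, t=60 (gap before it)
out = gap_alerts([0, 25, 30, 60])
```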
  18. How do we handle checkpointing and savepointing? A checkpoint is a snapshot of the current state of the Flink application, which also includes the consumed event positions of the input. Flink recovers the application by loading the state from the checkpoint and continuing from those event positions. Let’s look at how snapshotting works for our system. For simplicity’s sake, assume each of our Kafka topics has a single partition. As the KafkaSource reads events from these topics, it keeps incrementing the offsets. If a checkpoint is triggered after consuming two rules from the ruleStream and three events from the inputStream, the Flink task snapshots its state once those events have been processed. When the Flink task receives the checkpoint barrier, it also snapshots the Siddhi state for the consumed rules - including the job metadata, the execution plans, and the Siddhi runtimes - and communicates it to the Flink job master. This data is written asynchronously, which means Flink can continue to process input events.
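The checkpoint/restore cycle described above can be modeled as capturing the consumed offsets together with the operator state, then resuming from exactly that point after a failure. This is a toy model, not Flink's actual snapshot format; the offset and state values echo the slide's example.

```python
# Toy model of checkpointing: a snapshot pairs Kafka offsets with operator
# (Siddhi) state; recovery discards any progress made after the snapshot.
import copy

def checkpoint(offsets, state):
    return {"offsets": dict(offsets), "state": copy.deepcopy(state)}

def restore(snapshot):
    return dict(snapshot["offsets"]), copy.deepcopy(snapshot["state"])

offsets = {"ruleStream": 2, "inputStream": 3}
state = {"rules": ["r1", "r2"], "runtimes": {"SR1": "EP1"}}
snap = checkpoint(offsets, state)

offsets["inputStream"] = 99          # progress made after the checkpoint...
recovered_offsets, recovered_state = restore(snap)  # ...is rolled back on recovery
```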
  19. These are some of the lessons we learnt along the way - and we are still learning. In order to leverage the simplicity of Siddhi SQL for our dynamic rules, we decided to flatten our input events. So if our price field had two nested fields in it - e.g., amount and currency - after flattening, each would become its own top-level field. This is done through Siddhi SQL, and the rules are then applied to this flattened data; the flattening logic added an extra operator for us. We decided to develop and maintain the Flink-Siddhi library in-house, since it does not have active contributions; a recent example of maintaining it was upgrading our Flink and Siddhi versions. We also plan to contribute our work back to open source. Since Siddhi supports extensions, we should be aware of its limitations and write our own custom extensions if needed. Flink-Siddhi creates a new runtime for every new rule that is sent to the system; we have observed that this increases our latency, CPU, and memory usage. Since ours is a low-latency system, we need to make trade-offs to support our use cases.
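The flattening step described above lifts nested event fields (e.g. price.amount, price.currency) to top-level keys so simple SQL-like conditions can reference them directly. A generic sketch, assuming dict-shaped events and an underscore separator (the talk does this via Siddhi SQL, not Python):

```python
# Sketch of input-event flattening: nested dict fields become top-level keys
# joined with a separator, e.g. {"price": {"amount": 100}} -> {"price_amount": 100}.

def flatten(event, parent="", sep="_"):
    out = {}
    for key, value in event.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            out.update(flatten(value, name, sep))
        else:
            out[name] = value
    return out

flat = flatten({"ticker": "AAPL", "price": {"amount": 100, "currency": "USD"}})
```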
  20. This is what we had for our talk today. Thank you for listening to us. If you want to learn more about what we do, visit TechAtBloomberg.com; or if you want to join us, look for open roles on our Careers site. We will open it up for questions now.