The document describes FLOW, an abstraction layer that allows domain experts to develop Apache Flink streaming applications without needing expertise in Flink's APIs. FLOW provides a graphical user interface where users can build streaming data pipelines visually using common SQL operations and connectors. When users save their pipelines in FLOW, it generates the underlying Flink code. This allows domain experts across various fields to directly develop real-time stream processing solutions with Flink without involving data engineers to bridge the gap in knowledge.
4. How we used to work
FlinkForward 2017: Predictive Maintenance with Flink
FlinkForward 2018: Real-time driving score service using Flink
Domain experts | Target systems | Requirement
Refinery engineers | Expensive refinery equipment | Generating alarms ASAP (in real time)
Mobility service planners | Driving score service | Generating scores ASAP (in real time)
5. Needs for real-time stream processing
Domain experts (refinery engineers, mobility service planners): How to process real-time stream data?
Apache Flink APIs:
SQL/Table API (dynamic tables)
DataStream API (streams, windows)
Process Function (events, state, time)
6. Nonnegligible distance between domain experts and Flink
Domain experts: refinery engineers, mobility service planners
Around Apache Flink: JVM languages, IDE, project management, data visualization
Distances shown: 54.6 million kilometers, 384,400 kilometers
8. Data engineers bridge the gap
Data engineers sit between the domain experts (refinery engineers, mobility service planners) and Apache Flink.
Domain experts → data engineers: data, domain knowledge, requirement
Data engineers → domain experts: training & transfer, maintenance
Very inefficient!
9. Let domain experts do Flink directly via FLOW
Various domains of SK: Mobility, IoT, Telco, Energy, Semiconductor, E-commerce, Media
10. FLOW
Abstraction layer to hide the details of Flink application development.
Graphical user interface to build stream processing pipelines of SQL operations and connectors.
Domain experts bring their data, domain knowledge, and requirements, and express them in SQL!
11. TABLE OF CONTENTS
Do Flink on Web with FLOW
1. Motivation
2. Demo – identifying popular places with FLOW
3. Architecture
4. Supported operations & connectors
5. Summary
13. Demo Scenario: Identifying popular places in NYC
Kafka
Removing events that do not start or end in New York City
14. Demo Scenario: Identifying popular places in NYC
Kafka
Mapping the coordinates of each record into a grid cell (100m x 100m)
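The filtering and grid-mapping steps above can be pictured with a small sketch. The bounding box, the cell size in degrees, and the class name GeoSketch are illustrative assumptions, not values taken from FLOW or its UDFs.

```java
// Sketch only: bounding box and cell sizes are assumed for illustration.
final class GeoSketch {
    static final double LON_WEST = -74.05, LON_EAST = -73.70;
    static final double LAT_SOUTH = 40.50, LAT_NORTH = 41.00;
    // Cell size in degrees, chosen to be roughly 100 m on each side (assumption).
    static final double DELTA_LON = 0.0014, DELTA_LAT = 0.00125;
    // Number of cells in one west-to-east row: (LON_EAST - LON_WEST) / DELTA_LON.
    static final int CELLS_PER_ROW = 250;

    /** True if the point lies inside the assumed bounding box. */
    static boolean isInNYC(double lon, double lat) {
        return lon >= LON_WEST && lon < LON_EAST && lat > LAT_SOUTH && lat <= LAT_NORTH;
    }

    /** Row-major cell id, counting columns from the west edge and rows from the north edge. */
    static int toCellId(double lon, double lat) {
        int x = (int) Math.floor((lon - LON_WEST) / DELTA_LON);
        int y = (int) Math.floor((LAT_NORTH - lat) / DELTA_LAT);
        return x + y * CELLS_PER_ROW;
    }
}
```

An event would be kept only when both its start and end points pass isInNYC, and toCellId would then be applied to the coordinates of interest.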
15. Demo Scenario: Identifying popular places in NYC
Kafka
Creating a sliding window of size 10 minutes that slides by 5 minutes
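To make the window definition concrete, here is a minimal sketch of how one event timestamp maps to the sliding windows that contain it. It is plain Java, not Flink's window assigner, and the class and method names are hypothetical. With a 10-minute size and a 5-minute slide, every event belongs to two overlapping windows.

```java
import java.util.ArrayList;
import java.util.List;

final class SlidingWindowSketch {
    /** Start times (ms) of every window of length sizeMs, sliding by slideMs, that contains timestampMs. */
    static List<Long> windowStarts(long timestampMs, long sizeMs, long slideMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start that is not after the timestamp.
        long lastStart = timestampMs - Math.floorMod(timestampMs, slideMs);
        // Walk back one slide at a time while the window [start, start + size) still covers the timestamp.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= slideMs) {
            starts.add(start);
        }
        return starts;
    }
}
```

For example, an event at minute 7 falls into the window starting at minute 5 and the window starting at minute 0.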
16. Demo Scenario: Identifying popular places in NYC
Kafka
Counting the number of events in each grid cell (100m x 100m) per window
Example counts in the figure: 1, 2, 10
17. Demo Scenario: Identifying popular places in NYC
Kafka
Identifying the cells whose count is 10 or more as a popular place
Example counts in the figure: 1, 2, 10
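The last two steps (count per cell, then keep cells with a count of 10 or more) amount to a group-by followed by a threshold. A minimal in-memory sketch for the events of a single window, with hypothetical names, could look like this:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PopularPlacesSketch {
    /** Counts events per grid cell within one window and keeps only cells with count >= threshold. */
    static Map<Integer, Integer> popularCells(List<Integer> cellIdsInWindow, int threshold) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int cellId : cellIdsInWindow) {
            counts.merge(cellId, 1, Integer::sum);
        }
        // Drop every cell that did not reach the threshold.
        counts.values().removeIf(count -> count < threshold);
        return counts;
    }
}
```

In the demo the threshold is 10, so a cell seen 10 times in a window is reported and cells seen once or twice are not.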
58. TABLE OF CONTENTS
Do Flink on Web with FLOW
1. Motivation
2. Demo
3. Architecture – how FLOW interacts with Flink
4. Supported operations & connectors
5. Summary
59. Overall architecture of FLOW
Frontend Interface (by Angular.js) and RESTful Web Server (by Spring)
The frontend sends the spec. of the Kafka source to the server.
The schema and preview of the Kafka source are not computed yet; they are expected to be returned as the response.
60. Overall architecture of FLOW
Frontend Interface (by Angular.js) and RESTful Web Server (by Spring)
The server receives the spec. of the Kafka source and returns its schema and preview.
Preview Loaders:
KafkaPreviewLoader (Kafka consumer). IN: Kafka source spec. OUT: schema, preview.
FlinkPreviewLoader (local Flink minicluster). IN: parent schema and preview, child spec. OUT: schema, preview.
61. Overall architecture of FLOW
Frontend Interface (by Angular.js) and RESTful Web Server (by Spring)
Kafka source: spec., schema, and preview are available.
Filter: schema and preview are not computed yet.
For the Filter, the frontend sends not only its spec. but also the parent's schema & preview.
Preview Loaders:
KafkaPreviewLoader (Kafka consumer). IN: Kafka source spec. OUT: schema, preview.
FlinkPreviewLoader (local Flink minicluster). IN: parent schema and preview, child spec. OUT: schema, preview.
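The parent/child hand-off described above can be pictured as a function from (parent schema, parent preview, child spec.) to (child schema, child preview). The sketch below is a simplified in-memory stand-in: it evaluates a filter with a plain Java predicate instead of a local Flink minicluster, and every type and method name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class PreviewLoaderSketch {
    /** What a node carries once computed: column names and a few sample rows. */
    static final class SchemaAndPreview {
        final List<String> schema;
        final List<Map<String, Object>> preview;

        SchemaAndPreview(List<String> schema, List<Map<String, Object>> preview) {
            this.schema = schema;
            this.preview = preview;
        }
    }

    /** IN: parent schema/preview plus the child's filter spec. OUT: child schema/preview. */
    static SchemaAndPreview loadFilter(SchemaAndPreview parent, Predicate<Map<String, Object>> filterSpec) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Map<String, Object> row : parent.preview) {
            if (filterSpec.test(row)) {
                rows.add(row);
            }
        }
        // A filter keeps the parent's columns, so the schema passes through unchanged.
        return new SchemaAndPreview(parent.schema, rows);
    }
}
```

Because the child only needs its parent's schema and preview, the server does not have to replay the whole pipeline to preview one new node.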
62. FlinkPreviewLoader
FLOW piggybacks on Flink for schema & preview computation
IN: parent (Kafka source) schema and preview, child (Filter) spec.
OUT: child schema and preview, computed on a local Flink mini cluster
Environments: LocalStreamEnvironment (env), StreamTableEnvironment (tEnv)
// create the local environments
val env = StreamExecutionEnvironment.createLocalStreamEnvironment()
val tEnv = StreamTableEnvironment.create(env)
// register all known UDF instances
tEnv.registerFunction("isInNYC", new GeoUtils.IsInNYC())
tEnv.registerFunction("toCellId", new GeoUtils.ToCellId())
tEnv.registerFunction("toCoords", new GeoUtils.ToCoords())
// register the parent table from the parent's schema and preview
val parentTable = tEnv.registerTable( )
// apply the Filter spec.: keep events that both start and end in NYC
// (the end-coordinate field names endLon/endLat are assumed)
val table = parentTable.filter("isInNYC(startLon, startLat) && isInNYC(endLon, endLat)")
// to get the result in this thread
tEnv.toAppendStream(table, Row.class)
.addSink(new CollectSink())
env.execute()
// return the preview and the schema
return (CollectSink.values, table.getSchema())
Tables in tEnv: parentTable, table
Functions in tEnv:
isInNYC: (float, float) → boolean
toCellId: (float, float) → int
toCoords: int → (float, float)
63. Overall architecture of FLOW
Frontend Interface (by Angular.js) and RESTful Web Server (by Spring)
The server returns the Filter's schema and preview.
Both the Kafka source and the Filter now have a spec., schema, and preview.
Preview Loaders:
KafkaPreviewLoader (Kafka consumer). IN: Kafka source spec. OUT: schema, preview.
FlinkPreviewLoader (local Flink minicluster). IN: parent schema and preview, child spec. OUT: schema, preview.
64. Overall architecture of FLOW
Frontend Interface (by Angular.js) and RESTful Web Server (by Spring)
Pipeline: Kafka source, Filter, Select, Window, SQL query. Each node has a spec., schema, and preview.
Preview Loaders compute each child's schema and preview from the parent's schema and preview and the child's spec.
KafkaPreviewLoader (Kafka consumer). IN: Kafka source spec. OUT: schema, preview.
FlinkPreviewLoader (local Flink minicluster). IN: parent schema and preview, child spec. OUT: schema, preview.
65. Overall architecture of FLOW
Frontend Interface (by Angular.js) and RESTful Web Server (by Spring)
Pipeline: Kafka source, Filter, Select, Window, SQL query, ES Sink. Each node has a spec., schema, and preview.
Project Generator (by javapoet/freemarker): takes the spec. and schema of every node and generates a Maven project (Flink application).
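As a rough illustration of the generation step, the sketch below turns an ordered list of node specs into chained Table API source text. It uses plain string building rather than javapoet or freemarker, the NodeSpec shape is hypothetical, and the emitted text is only an example of what generated code might look like, not FLOW's actual output.

```java
import java.util.List;

final class GeneratorSketch {
    /** Hypothetical node spec: an operation name and its expression argument. */
    static final class NodeSpec {
        final String op;
        final String arg;

        NodeSpec(String op, String arg) {
            this.op = op;
            this.arg = arg;
        }
    }

    /** Emits one chained call per node, in pipeline order. */
    static String generate(String sourceTable, List<NodeSpec> nodes) {
        StringBuilder code = new StringBuilder("Table result = tEnv.scan(\"" + sourceTable + "\")");
        for (NodeSpec node : nodes) {
            code.append("\n    .").append(node.op).append("(\"").append(node.arg).append("\")");
        }
        return code.append(";").toString();
    }
}
```

A real generator would also emit the connector setup, the project layout, and the build file; a template engine keeps that boilerplate out of the generator logic.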
66. TABLE OF CONTENTS
Do Flink on Web with FLOW
1. Motivation
2. Demo
3. Architecture
4. Supported operations & connectors
5. Summary
We also support the temporal JOIN operation.
First you configure the temporal table,
then the append-only table,
and finally you list the expressions as you did in the select operation.
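For readers new to temporal joins: each row of the append-only table is matched with the version of the temporal table that was valid at that row's time. The sketch below shows that lookup for a single key in plain Java; it is not Flink code, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class TemporalJoinSketch {
    // Versioned (temporal) table for one key: version time -> value valid from that time on.
    private final TreeMap<Long, Double> versions = new TreeMap<>();

    /** Adds or replaces the version that becomes valid at versionTime. */
    void upsert(long versionTime, double value) {
        versions.put(versionTime, value);
    }

    /** Returns the value valid at eventTime (the latest version not after it), or null if none exists yet. */
    Double valueAsOf(long eventTime) {
        Map.Entry<Long, Double> entry = versions.floorEntry(eventTime);
        return entry == null ? null : entry.getValue();
    }
}
```

An append-only event at time 50 would be joined with the version written at time 0 even if a newer version arrives at time 100, which is what distinguishes a temporal join from a join against the table's latest state.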