This document discusses Flink's Table and SQL APIs, which provide a unified way to write batch and streaming queries. It motivates the need for a relational API: while Flink's DataStream API is powerful, it demands significant technical skill. The Table and SQL APIs let users focus on business logic by writing declarative queries. It describes how the APIs work, including how queries are translated into logical and execution plans, and how batch, streaming, and windowed queries are supported. Finally, it outlines the current capabilities and the opportunities for contributors to help expand Flink's relational features.
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for batch and stream processing
Timo Walther
Apache Flink PMC
@twalthr
Flink Forward @ San Francisco - April 11th, 2017
Table & SQL API
unified APIs for batch and stream processing
3. DataStream API is great…
Very expressive stream processing
• Transform data, update state, define windows, aggregate, etc.
Highly customizable windowing logic
• Assigners, Triggers, Evictors, Lateness
Asynchronous I/O
• Improve communication to external systems
Low-level Operations
• ProcessFunction gives access to timestamps and timers
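For example, a minimal sketch of a ProcessFunction that reads an element's timestamp and registers an event-time timer (class name and logic are illustrative, not from the talk):
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.util.Collector
// emits a message one second (event time) after each element arrives
class TimeoutFunction extends ProcessFunction[(String, Long), String] {
  override def processElement(
      value: (String, Long),
      ctx: ProcessFunction[(String, Long), String]#Context,
      out: Collector[String]): Unit = {
    // access the element's timestamp and register a timer relative to it
    ctx.timerService().registerEventTimeTimer(ctx.timestamp() + 1000)
  }
  override def onTimer(
      timestamp: Long,
      ctx: ProcessFunction[(String, Long), String]#OnTimerContext,
      out: Collector[String]): Unit = {
    out.collect(s"timer fired at $timestamp")
  }
}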
4. … but it is not for Everyone!
Writing DataStream programs is not always easy
• Stream processing technology spreads rapidly
• New streaming concepts (time, state, windows, ...)
Requires knowledge & skill
• Continuous applications have special requirements
• Programming experience (Java / Scala)
Users want to focus on their business logic
5. Why not a Relational API?
Relational API is declarative
• User says what is needed, system decides how to compute it
Queries can be effectively optimized
• Fewer black boxes, well-researched field
Queries are efficiently executed
• Let Flink handle state, time, and common mistakes
”Everybody” knows and uses SQL!
6. Goals
Easy, declarative, and concise relational API
Tool for a wide range of use cases
Relational API as a unifying layer
• Queries on batch tables terminate and produce a finite result
• Queries on streaming tables run continuously and produce a result stream
Same syntax & semantics for both queries
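A minimal sketch of this unification, assuming an "orders" table with fields 'user and 'amount is registered in both environments (names are illustrative, not from the talk):
import org.apache.flink.api.scala.ExecutionEnvironment
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.TableEnvironment
import org.apache.flink.table.api.scala._
val bEnv = ExecutionEnvironment.getExecutionEnvironment
val btEnv = TableEnvironment.getTableEnvironment(bEnv)       // batch
val sEnv = StreamExecutionEnvironment.getExecutionEnvironment
val stEnv = TableEnvironment.getTableEnvironment(sEnv)       // streaming
// the identical query terminates with a finite result on the batch table
// and runs continuously, producing a result stream, on the streaming table
val batchResult = btEnv.scan("orders").groupBy('user).select('user, 'amount.sum)
val streamResult = stEnv.scan("orders").groupBy('user).select('user, 'amount.sum)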
8. Table API & SQL
Flink features two relational APIs
• Table API: LINQ-style API for Java & Scala (since Flink 0.9.0)
• SQL: Standard SQL (since Flink 1.1.0)
[Diagram: layered API stack — SQL and the Table API sit on top of the DataSet API and the DataStream API, which run on the Flink Dataflow Runtime]
9. Table API & SQL Example
9
val tEnv = TableEnvironment.getTableEnvironment(env)
// configure your data source
val customerSource = CsvTableSource.builder()
  .path("/path/to/customer_data.csv")
  .field("name", Types.STRING).field("prefs", Types.STRING)
  .build()
// register as a table
tEnv.registerTableSource("cust", customerSource)
// define your table program (Table API or, equivalently, SQL)
val table = tEnv.scan("cust").select('name.lowerCase(), myParser('prefs))
val sqlTable = tEnv.sql("SELECT LOWER(name), myParser(prefs) FROM cust")
// convert
val ds: DataStream[Customer] = table.toDataStream[Customer]
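The myParser call above assumes a user-defined scalar function. A minimal sketch of how such a UDF could look (class name and parsing logic are illustrative, not from the talk):
import org.apache.flink.table.functions.ScalarFunction
// hypothetical parser: extracts the first preference entry
class MyParser extends ScalarFunction {
  def eval(prefs: String): String = prefs.split(';').headOption.getOrElse("")
}
val myParser = new MyParser
// registration by name so that the SQL variant can resolve the function
tEnv.registerFunction("myParser", myParser)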
10. Windowing in Table API
val sensorData: DataStream[(String, Long, Double)] = ???
// convert DataStream into Table
val sensorTable: Table = sensorData
  .toTable(tableEnv, 'location, 'rowtime, 'tempF)
// define query on Table
val avgTempCTable: Table = sensorTable
  .window(Tumble over 1.day on 'rowtime as 'w)
  .groupBy('location, 'w)
  .select('w.start as 'day,
          'location,
          (('tempF.avg - 32) * 0.556) as 'avgTempC)
  .where('location like "room%")
11. Windowing in SQL
val sensorData: DataStream[(String, Long, Double)] = ???
// register DataStream
tableEnv.registerDataStream(
  "sensorData", sensorData, 'location, 'rowtime, 'tempF)
// query registered Table
val avgTempCTable: Table = tableEnv.sql("""
  SELECT TUMBLE_START(rowtime, INTERVAL '1' DAY) AS day,
         location,
         AVG((tempF - 32) * 0.556) AS avgTempC
  FROM sensorData
  WHERE location LIKE 'room%'
  GROUP BY location, TUMBLE(rowtime, INTERVAL '1' DAY)
""")
13. Architecture
[Diagram: Table API and SQL queries enter through the Table API Validator and the Calcite Parser & Validator, respectively. Tables — created from DataSets, DataStreams, or Table Sources — are registered in the Calcite Catalog. Both APIs produce a Calcite Logical Plan, which the Calcite Optimizer translates via DataSet Rules into a DataSet Plan (yielding a DataSet program) or via DataStream Rules into a DataStream Plan (yielding a DataStream program)]
17. Translation to Logical Plan
sensorTable
  .window(Tumble over 1.day on 'rowtime as 'w)
  .groupBy('location, 'w)
  .select('w.start as 'day,
          'location,
          (('tempF.avg - 32) * 0.556) as 'avgTempC)
  .where('location like "room%")
[Diagram: the Table API calls build a tree of Table Nodes (Catalog Node → Window Aggregate → Project → Filter); after Table API Validation, this tree is translated into a Calcite Logical Plan (Logical Table Scan → Logical Window Aggregate → Logical Project → Logical Filter)]
18. Translation to DataStream Plan
[Diagram: the Calcite Logical Plan (Logical Table Scan → Logical Window Aggregate → Logical Project → Logical Filter) is optimized into an Optimized Plan (Logical Table Scan → Logical Window Aggregate → Logical Calc), which is then transformed into a DataStream Plan (DataStream Scan → DataStream Calc → DataStream Aggregate)]
19. Translation to Flink Program
[Diagram: the DataStream Plan (DataStream Scan → DataStream Calc → DataStream Aggregate) is translated and code-generated into a DataStream Program consisting of a (forwarding) FlatMap Function and an Aggregate & Window Function]
20. Current State (in master)
Batch support
• Selection, Projection, Sort, Inner & Outer Joins, Set operations
• Group-Windows for Slide, Tumble, Session
Streaming support
• Selection, Projection, Union
• Group-Windows for Slide, Tumble, Session
• Different SQL OVER-Windows (RANGE/ROWS)
UDFs, UDTFs, custom rules
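As a sketch of such an OVER window in streaming SQL (reusing the sensorData table registered earlier; the query itself is illustrative):
// moving average over the last 5 rows per location
val movingAvgTable: Table = tableEnv.sql("""
  SELECT location,
         AVG(tempF) OVER (
           PARTITION BY location
           ORDER BY rowtime
           ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) AS movingAvgTempF
  FROM sensorData
""")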
21. Use Cases for Streaming SQL
Continuous ETL & Data Import
Live Dashboards & Reports
23. Dynamic Tables Model
Dynamic tables change over time
Dynamic tables are treated like static batch tables
• Dynamic tables are queried with standard SQL / Table API
• Every query returns another Dynamic Table
“Stream / Table Duality”
• Stream ←→ Dynamic Table conversions without information loss
25. Querying Dynamic Tables
Dynamic tables change over time
• A[t]: Table A at specific point in time t
Dynamic tables are queried with relational semantics
• Result of a query changes as input table changes
• q(A[t]): Evaluate query q on table A at time t
Query result is continuously updated as t progresses
• Similar to maintaining a materialized view
• t is current event time
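For example, a non-windowed aggregation over a hypothetical clicks table (fields user and url assumed) yields a result with one row per user, and each arriving click updates the count for its key:
// as t progresses, previously emitted counts are updated in place,
// like a continuously maintained materialized view
val clickCounts: Table = tableEnv.sql(
  "SELECT user, COUNT(url) AS cnt FROM clicks GROUP BY user")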
28. Querying a Dynamic Table
Can we run any query on Dynamic Tables? No!
State may not grow infinitely as more data arrives
• Set clean-up timeout or key constraints.
Input may only trigger partial re-computation
Queries with possibly unbounded state or computation
are rejected
29. Dynamic Table to Stream
Convert Dynamic Table modifications into stream messages
Similar to database logging techniques
• Undo: previous value of a modified element
• Redo: new value of a modified element
• Undo+Redo: old and the new value of a changed element
For Dynamic Tables: Redo or Undo+Redo
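A minimal sketch of how such modification messages could be modeled (illustrative Scala types, not Flink's actual runtime encoding):
sealed trait Change[T]
case class Insert[T](newRow: T) extends Change[T] // redo: carries the new value
case class Delete[T](oldRow: T) extends Change[T] // undo: carries the old value
// in undo+redo mode, an update is a Delete(oldRow) followed by an Insert(newRow)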
30. Dynamic Table to Stream
Undo+Redo Stream (because A is in Append Mode):
31. Dynamic Table to Stream
Redo Stream (because A is in Update Mode):
32. Result computation & refinement
[Timeline figure: a first result is emitted early (end − x) and then updated at a configured rate (every x); the complete result can be computed at the window end (end) and is emitted as the complete result (end + x); late updates follow on new data until the last result (end + x), at which point state is purged]
33. Contributions welcome!
Huge interest and many contributors
• Adding more window operators
• Introducing dynamic tables
And there is a lot more to do
• New operators and features for streaming and batch
• Performance improvements
• Tooling and integration
Try it out, give feedback, and start contributing!
DATASTREAM: event-time semantics, stateful exactly-once processing, high throughput & low latency at the same time; compute exact and deterministic results in real time
ASYNC: enrich stream events with data stored in a database; the communication delay with the external system does not dominate the streaming application's total work
There is a talent gap.
SKILL: memory-bound; handling of timestamps and watermarks in ProcessFunctions; users want an API which quickly solves 80% of their use cases, where simple tasks can be defined using little code.
BUSINESS:
No null support, no timestamps, no common tools for string normalization, etc.
Users do not specify the implementation.
UDFs: great for expressiveness, bad for optimization (need for manual tuning).
ProcessFunctions implemented in Flink handle state and time.
SQL is the most widely used language for data analytics.
SQL would make stream processing much more accessible.
Flink is a platform for distributed stream and batch data processing.
Users only need to learn a single API.
A query produces exactly the same result regardless of whether its input is static batch data or streaming data.
TABLE API:
• Language INtegrated Query (LINQ) API
• Queries are not embedded as Strings
• Centered around Table objects
• Operations are applied on Tables and return a Table
SQL:
• Standard SQL
• Queries are embedded as Strings into programs
• Referenced tables must be registered
• Queries return a Table object
• Integration with Table API
Equivalent feature set (at the moment)
Table API and SQL can be mixed
Both are tightly integrated with Flink's core API, often referred to together as the Table API
Works for both batch and stream.
Maintain standard SQL compliance
Apache Calcite is a SQL parsing and query optimizer framework
Used by many other projects to parse and optimize SQL queries: Apache Drill, Apache Hive, Apache Kylin, Cascading, …
Tables, columns, and data types are stored in a central catalog
Tables can be created from:
• DataSets
• DataStreams
• TableSources (without going through the DataSet/DataStream API)
Table API and SQL queries are translated into a common logical plan representation.
Logical plans are translated and optimized depending on the execution backend.
Plans are transformed into DataSet or DataStream programs.
API calls are translated into logical operators and immediately validated.
API operators compose a tree; before optimization, the API operator tree is translated into a logical Calcite plan.
Calcite provides many optimization rules.
Custom rules transform logical nodes into Flink nodes:
• DataSet rules to translate batch queries
• DataStream rules to translate streaming queries
• Constant expression reduction
Translation into DataStream or DataSet operators, designed with continuous queries in mind!
Code generation uses the Janino Compiler Framework.
Arriving data is incrementally aggregated using the given aggregate function, so the window function typically has only a single value to process when called.
OVER-window variants:
• RANGE UNBOUNDED PRECEDING
• ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
• RANGE BETWEEN INTERVAL '1' SECOND PRECEDING AND CURRENT ROW
BUT:
• Relational processing of batch data is well defined and understood.
• Not all relational operators can be naively applied on streams.
• There is no widely accepted definition of the semantics for relational processing of streaming data.
All supported operators have in common that they never update result records which have been emitted:
• Not an issue for record-at-a-time operators such as projection and filter.
• Affects operators that collect and process multiple records: emitted results cannot be updated, late events are discarded.
• Emit data to log-style systems where emitted data cannot be updated (results cannot be refined): only append operations, no updates or deletes.
But there are many streaming analytics use cases that need to update results:
• No discarding possible.
• Early results are needed and can be updated and refined.
• Analyze and explore streaming data in a real-time fashion.
This would vastly increase the scope of the APIs and the range of supported use cases, which can be challenging to implement using the DataStream API.
It is important to note that this is only the logical model and does not imply how the query is actually executed.
RETURNS: the query runs continuously and produces a table that is continuously updated -> a Dynamic Table; result-updating queries.
But: we must of course preserve the unified semantics for stream and batch inputs.
We have to specify how the records of a stream modify the dynamic table.
APPEND: each stream record is an insert modification; conceptually, the dynamic table is ever-growing and infinite in size.
UPDATE/REPLACE: specify a unique key attribute; a stream record can then represent an insert, update, or delete modification; append mode is in fact a special case of update mode.
Let's imagine we take a snapshot of a dynamic table at a specific point in time. A snapshot is like a regular static batch table.
PROGRESS: If we repeatedly compute the result of a query on snapshots for progressing points in time, we obtain many static result tables. They change over time and effectively constitute a dynamic table.
At each point in time t, the result table is equivalent to a batch query on the dynamic table A at time t.
APPEND MODE:
The query continuously updates result rows that it had previously emitted instead of only adding new rows.
The size of the result table depends on the number of distinct grouping keys of the input table.
NOTE:
As long as they are not emitted, these changes are completely internal and not visible to a user; changes materialize when a dynamic table is emitted.
In contrast to the result of the first example, the resulting table grows over time.
While the non-windowed query (mostly) updates rows of the result table, the windowed aggregation query only appends new rows to the result table.
Only queries that can be continuously, incrementally, and efficiently computed are supported.
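To make that contrast concrete, two illustrative queries on a hypothetical clicks table (assumed fields: user, url, and a rowtime attribute):
// (1) non-windowed: result rows are updated in place as new clicks arrive
val updating: Table = tableEnv.sql(
  "SELECT user, COUNT(url) AS cnt FROM clicks GROUP BY user")
// (2) windowed: each closed window only appends new rows to the result
val appending: Table = tableEnv.sql("""
  SELECT user, COUNT(url) AS cnt,
         TUMBLE_END(rowtime, INTERVAL '1' HOUR) AS hourEnd
  FROM clicks
  GROUP BY user, TUMBLE(rowtime, INTERVAL '1' HOUR)
""")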
Traditional database systems use logs to rebuild tables in case of failures and for replication.
UNDO logs contain the previous value of a modified element to revert incomplete transactions
REDO logs contain the new value of a modified element to redo lost changes of completed transactions
UNDO/REDO logs contain the old and the new value of a changed element
An insert modification is emitted as an insert message with the new row; a delete modification is emitted as a delete message with the old row; and an update modification is emitted as a delete message with the old row and an insert message with the new row.
The current processing model is a subset of the dynamic table model.
Downstream operators or data sinks need to be able to correctly handle both types of messages.
Only tables with unique keys can have update and delete modifications.
The downstream operators need to be able to access previous values by key.
val outputOpts = OutputOptions()
  .firstResult(-15.minutes)    // first result 15 mins early
  .completeResult(+5.minutes)  // complete result 5 mins late
  .updateRate(3.minutes)       // result is updated every 3 mins
  .lateUpdates(true)           // late result updates enabled
  .lastResult(+15.minutes)     // last result 15 mins late -> until state is purged
More TableSources and Sinks
Generated intermediate data types, serializers, comparators
Standalone SQL client
Code-generated aggregate functions