SlideShare uma empresa Scribd logo
1 de 22
FEBRUARY9, 2017, WARSAW
Stream Analytics with SQL on Apache Flink®
Fabian Hueske | Apache Flink PMC member | Co-founder dataArtisans
FEBRUARY9, 2017, WARSAW
Streams are Everywhere
FEBRUARY9, 2017, WARSAW
Data Analytics on Streaming Data
• Periodic batch processing
• Lots of duct tape and baling wire
• It’s up to you to make
everything work… reliably!
• High latency
• Continuous stream processing
• Framework takes care of failures
• Low latency
FEBRUARY9, 2017, WARSAW
Stream Processing in Apache Flink
• Platform for scalable stream processing
• Fast
• Low latency and high throughput
• Accurate
• Stateful streaming processing in event time
• Reliable
• Exactly-once state guarantees
• Highly available cluster setup
FEBRUARY9, 2017, WARSAW
Streaming Applications Powered by Flink
30 Flink applications in production for more than
one year. 10 billion events (2TB) processed daily
Complex jobs of > 30 operators running 24/7,
processing 30 billion events daily, maintaining
state of 100s of GB with exactly-once guarantees
Largest job has > 20 operators, runs on > 5000
vCores in 1000-node cluster, processes millions of
events per second
FEBRUARY9, 2017, WARSAW
Stream Processing is not for Everybody, … yet
• APIs of open source stream processors target developers
• Implementing streaming applications requires knowledge & skill
• Stream processing concepts (time, state, windows, triggers, ...)
• Programming experience (Java / Scala APIs)
• Stream processing technology spreads rapidly
• There is a talent gap
FEBRUARY9, 2017, WARSAW
What about SQL?
• SQL is the most widely used language for data analytics
• Many good reasons to use SQL
• Declarative specification
• Optimization
• Efficient execution
• “Everybody” knows SQL
• SQL would make stream processing much more accessible, but…
FEBRUARY9, 2017, WARSAW
No OS Stream Processor Offers Decent SQL Support
• SQL was not designed with streaming data in mind
• Relations are sets. Streams are infinite sequences.
• Records arrive over time.
• Syntax
• Time-based operations are cumbersome to specify (aggregates, joins)
• Semantics
• A SQL query should compute the same result on a batch table and a stream
FEBRUARY9, 2017, WARSAW
• Standard SQL and LINQ-style Table API
• Unified APIs for batch & streaming data
• Common translation layers
• Optimization based on Apache Calcite
• Type system & code-generation
• Table sources & sinks
• Streaming SQL & Table API is work in
progress
Flink’s SQL Support & Table API
FEBRUARY9, 2017, WARSAW
What are the Use Cases for Stream SQL?
• Continuous ETL & Data Import
• Live Dashboards & Reports
• Ad-hoc Analytics & Exploration
FEBRUARY9, 2017, WARSAW
Dynamic Tables
• Core concept is a “Dynamic Table”
• Dynamic tables change over time
• Dynamic tables are treated like static batch tables
• Dynamic tables are queried with standard SQL
• A query returns another dynamic table
• Stream ←→ Dynamic Table conversions without information loss
• “Stream / Table Duality”
FEBRUARY9, 2017, WARSAW
Stream → Dynamic Table
• Append
• Replace by Key
time k
1 A
2 B
4 A
5 C
7 B
8 A
9 B
… …
time k
2, B4, A5, C7, B8, A9, B 1, A
2, B4, A5, C7, B8, A9, B 1, A
8 A
9 B
5 C
… …
FEBRUARY9, 2017, WARSAW
Querying a Dynamic Table
• Dynamic tables change over time
• A[t]: Table A at time t
• Dynamic tables are queried with regular SQL
• Result of a query changes as input table changes
• q(A[t]): Evaluate query q on table A at time t
• As time t progresses, the query result is continuously updated
• similar to maintaining a materialized view
• t is current event time
FEBRUARY9, 2017, WARSAW
Querying a Dynamic Table
time k
k cnt
A 3
B 2
C 1
9 B
k cnt
A 3
B 3
C 1
12 C
k cnt
A 3
B 3
C 2
A[8]
A[9]
A[12]
q(A[8])
q(A[9])
q(A[12])
Table A
q:
SELECT
k,
COUNT(k) as cnt
FROM A
GROUP BY k
1 A
2 B
4 A
5 C
7 B
8 A
FEBRUARY9, 2017, WARSAW
time k
A[5]
A[10]
A[15]
q(A[5])
q(A[10])
q(A[15])
Table A
Querying a Dynamic Table
7 B
8 A
9 B
11 A
12 C
14 C
15 A
k cnt endT
A 2 5
B 1 5
C 1 5
q(A)
A 1 10
B 2 10
A 2 15
C 2 15
q:
SELECT
k,
COUNT(k) AS cnt,
TUMBLE_END(
time,
INTERVAL '5' SECONDS)
AS endT
FROM A
GROUP BY
k,
TUMBLE(
time,
INTERVAL '5' SECONDS)
1 A
2 B
4 A
5 C
FEBRUARY9, 2017, WARSAW
Can We Run Any Query on Dynamic Tables?
• No 
• There are state and computation constraints
• State may not grow infinitely as more data arrives
• Clean-up timeout must be defined
• Input updates may only trigger partial re-computation of the result
• Queries with possibly unbounded state or computation are rejected
• Optimizer performs validation
FEBRUARY9, 2017, WARSAW
Bounding the State of a Query
• State grows infinitely with domain of grouping attribute
• Bound query input by time
• Query aggregates data of last 24 hours. Older data is discarded.
SELECT k, COUNT(k) AS cnt
FROM A
GROUP BY k
SELECT k, COUNT(k) AS cnt
FROM A
WHERE last(time, INTERVAL ‘1’ DAY)
GROUP BY k
STOP!
UNBOUNED
STATE!
FEBRUARY9, 2017, WARSAW
Updating Results and Late Arriving Data
• Sometimes emitted results need to be updated
• Results which are continuously updated
• Results for which relevant records arrived late
• Results that might be updated must be kept as state
• Clean-up timeout
• When a table is converted into a stream, updates must be propagated
• Update mode
• Add/Retract mode
FEBRUARY9, 2017, WARSAW
Dynamic Table → Stream: Update Mode
time k
Table A
B, 1A, 2C, 1B, 2A, 3 A, 1
SELECT
k,
COUNT(k) AS cnt
FROM A
GROUP BY k
1 A
2 B
4 A
5 C
7 B
8 A
… …
Update by Key
FEBRUARY9, 2017, WARSAW
Dynamic Table → Stream: Add/Retract Mode
time k
Table A
+ B, 1+ A, 2+ C, 1+ B, 2+ A, 3 + A, 1- A, 1- B, 1- A, 2
1 A
2 B
4 A
5 C
7 B
8 A
… …
SELECT
k,
COUNT(k) AS cnt
FROM A
GROUP BY k
Add (+) / Retract (-)
FEBRUARY9, 2017, WARSAW
Current State of SQL and Table API
• Huge interest and many contributors
• Current development efforts
• Adding more window operators
• Introducing dynamic tables
• And there is a lot more to do
• New operators and features for streaming and batch
• Performance improvements
• Tooling and integration
• Try it out, give feedback, and start contributing!
FEBRUARY9, 2017, WARSAW
Stream Analytics with SQL on Apache Flink
Fabian Hueske | @fhueske

Mais conteúdo relacionado

Mais procurados

Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward
 
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward
 

Mais procurados (20)

Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...
Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...
Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...
 
Flink Forward SF 2017: Stefan Richter - Improvements for large state and reco...
Flink Forward SF 2017: Stefan Richter - Improvements for large state and reco...Flink Forward SF 2017: Stefan Richter - Improvements for large state and reco...
Flink Forward SF 2017: Stefan Richter - Improvements for large state and reco...
 
Flink Forward Berlin 2017: Francesco Versaci - Integrating Flink and Kafka in...
Flink Forward Berlin 2017: Francesco Versaci - Integrating Flink and Kafka in...Flink Forward Berlin 2017: Francesco Versaci - Integrating Flink and Kafka in...
Flink Forward Berlin 2017: Francesco Versaci - Integrating Flink and Kafka in...
 
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
 
Apache Flink's Table & SQL API - unified APIs for batch and stream processing
Apache Flink's Table & SQL API - unified APIs for batch and stream processingApache Flink's Table & SQL API - unified APIs for batch and stream processing
Apache Flink's Table & SQL API - unified APIs for batch and stream processing
 
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
 
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
 
Flink Forward SF 2017: David Hardwick, Sean Hester & David Brelloch - Dynami...
Flink Forward SF 2017: David Hardwick, Sean Hester & David Brelloch -  Dynami...Flink Forward SF 2017: David Hardwick, Sean Hester & David Brelloch -  Dynami...
Flink Forward SF 2017: David Hardwick, Sean Hester & David Brelloch - Dynami...
 
Flink Forward SF 2017: Kenneth Knowles - Back to Sessions overview
Flink Forward SF 2017: Kenneth Knowles - Back to Sessions overviewFlink Forward SF 2017: Kenneth Knowles - Back to Sessions overview
Flink Forward SF 2017: Kenneth Knowles - Back to Sessions overview
 
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...
 
What's new in 1.9.0 blink planner - Kurt Young, Alibaba
What's new in 1.9.0 blink planner - Kurt Young, AlibabaWhat's new in 1.9.0 blink planner - Kurt Young, Alibaba
What's new in 1.9.0 blink planner - Kurt Young, Alibaba
 
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...
 
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...
 
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
 
Keynote: Stephan Ewen - Stream Processing as a Foundational Paradigm and Apac...
Keynote: Stephan Ewen - Stream Processing as a Foundational Paradigm and Apac...Keynote: Stephan Ewen - Stream Processing as a Foundational Paradigm and Apac...
Keynote: Stephan Ewen - Stream Processing as a Foundational Paradigm and Apac...
 
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
 
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...
 
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
 
Stateful Distributed Stream Processing
Stateful Distributed Stream ProcessingStateful Distributed Stream Processing
Stateful Distributed Stream Processing
 
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4
 

Destaque

Introducing Kafka's Streams API
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams API
confluent
 

Destaque (20)

Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
How Humans See Data - Amazon Cut
How Humans See Data - Amazon CutHow Humans See Data - Amazon Cut
How Humans See Data - Amazon Cut
 
Scalable Python with Docker, Kubernetes, OpenShift
Scalable Python with Docker, Kubernetes, OpenShiftScalable Python with Docker, Kubernetes, OpenShift
Scalable Python with Docker, Kubernetes, OpenShift
 
Taking a look under the hood of Apache Flink's relational APIs.
Taking a look under the hood of Apache Flink's relational APIs.Taking a look under the hood of Apache Flink's relational APIs.
Taking a look under the hood of Apache Flink's relational APIs.
 
Beacosystem Talk @ MongoDB User Group Dublin @sos100
Beacosystem Talk @ MongoDB User Group Dublin @sos100Beacosystem Talk @ MongoDB User Group Dublin @sos100
Beacosystem Talk @ MongoDB User Group Dublin @sos100
 
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
 
Streaming analytics better than batch when and why - (Big Data Tech 2017)
Streaming analytics better than batch   when and why - (Big Data Tech 2017)Streaming analytics better than batch   when and why - (Big Data Tech 2017)
Streaming analytics better than batch when and why - (Big Data Tech 2017)
 
Big Data Day LA 2015 - Always-on Ingestion for Data at Scale by Arvind Prabha...
Big Data Day LA 2015 - Always-on Ingestion for Data at Scale by Arvind Prabha...Big Data Day LA 2015 - Always-on Ingestion for Data at Scale by Arvind Prabha...
Big Data Day LA 2015 - Always-on Ingestion for Data at Scale by Arvind Prabha...
 
IoT Innovation Lab Berlin @relayr - Kay Lerch on Getting basics right for you...
IoT Innovation Lab Berlin @relayr - Kay Lerch on Getting basics right for you...IoT Innovation Lab Berlin @relayr - Kay Lerch on Getting basics right for you...
IoT Innovation Lab Berlin @relayr - Kay Lerch on Getting basics right for you...
 
The Data Dichotomy- Rethinking the Way We Treat Data and Services
The Data Dichotomy- Rethinking the Way We Treat Data and ServicesThe Data Dichotomy- Rethinking the Way We Treat Data and Services
The Data Dichotomy- Rethinking the Way We Treat Data and Services
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
 
Comparison of various streaming technologies
Comparison of various streaming technologiesComparison of various streaming technologies
Comparison of various streaming technologies
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
 
Real-Time Event & Stream Processing on MS Azure
Real-Time Event & Stream Processing on MS AzureReal-Time Event & Stream Processing on MS Azure
Real-Time Event & Stream Processing on MS Azure
 
Introducing Kafka's Streams API
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams API
 
Developing Connected Applications with AWS IoT - Technical 301
Developing Connected Applications with AWS IoT - Technical 301Developing Connected Applications with AWS IoT - Technical 301
Developing Connected Applications with AWS IoT - Technical 301
 
Lightbend Fast Data Platform
Lightbend Fast Data PlatformLightbend Fast Data Platform
Lightbend Fast Data Platform
 
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of Things
 
Power of the Log: LSM & Append Only Data Structures
Power of the Log: LSM & Append Only Data StructuresPower of the Log: LSM & Append Only Data Structures
Power of the Log: LSM & Append Only Data Structures
 
Study: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsStudy: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving Cars
 

Semelhante a Fabian Hueske - Stream Analytics with SQL on Apache Flink

Stream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian HueskeStream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian Hueske
Evention
 
Project_Plan-Datalake_v1.0_26-10-2022.pptx
Project_Plan-Datalake_v1.0_26-10-2022.pptxProject_Plan-Datalake_v1.0_26-10-2022.pptx
Project_Plan-Datalake_v1.0_26-10-2022.pptx
ssuser3e2857
 
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
Amazon Web Services Korea
 
Introducing the eDB360 Tool
Introducing the eDB360 ToolIntroducing the eDB360 Tool
Introducing the eDB360 Tool
Carlos Sierra
 

Semelhante a Fabian Hueske - Stream Analytics with SQL on Apache Flink (20)

Stream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian HueskeStream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian Hueske
 
Stream Analytics with SQL on Apache Flink
 Stream Analytics with SQL on Apache Flink Stream Analytics with SQL on Apache Flink
Stream Analytics with SQL on Apache Flink
 
Timo Walther - Table & SQL API - unified APIs for batch and stream processing
Timo Walther - Table & SQL API - unified APIs for batch and stream processingTimo Walther - Table & SQL API - unified APIs for batch and stream processing
Timo Walther - Table & SQL API - unified APIs for batch and stream processing
 
Project_Plan-Datalake_v1.0_26-10-2022.pptx
Project_Plan-Datalake_v1.0_26-10-2022.pptxProject_Plan-Datalake_v1.0_26-10-2022.pptx
Project_Plan-Datalake_v1.0_26-10-2022.pptx
 
Learn from HomeAway Hadoop Development and Operations Best Practices
Learn from HomeAway Hadoop Development and Operations Best PracticesLearn from HomeAway Hadoop Development and Operations Best Practices
Learn from HomeAway Hadoop Development and Operations Best Practices
 
Streaming SQL Foundations: Why I ❤ Streams+Tables
Streaming SQL Foundations: Why I ❤ Streams+TablesStreaming SQL Foundations: Why I ❤ Streams+Tables
Streaming SQL Foundations: Why I ❤ Streams+Tables
 
AWS Innovate: Running Databases in AWS- Russell Nash
AWS Innovate: Running Databases in AWS- Russell NashAWS Innovate: Running Databases in AWS- Russell Nash
AWS Innovate: Running Databases in AWS- Russell Nash
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
 
Couchbase Chennai Meetup: Developing with Couchbase- made easy
Couchbase Chennai Meetup:  Developing with Couchbase- made easyCouchbase Chennai Meetup:  Developing with Couchbase- made easy
Couchbase Chennai Meetup: Developing with Couchbase- made easy
 
Couchbase Singapore Meetup #2: Why Developing with Couchbase is easy !!
Couchbase Singapore Meetup #2:  Why Developing with Couchbase is easy !! Couchbase Singapore Meetup #2:  Why Developing with Couchbase is easy !!
Couchbase Singapore Meetup #2: Why Developing with Couchbase is easy !!
 
What's new in SQL Server 2017
What's new in SQL Server 2017What's new in SQL Server 2017
What's new in SQL Server 2017
 
NoSQL_Night
NoSQL_NightNoSQL_Night
NoSQL_Night
 
Delta Architecture
Delta ArchitectureDelta Architecture
Delta Architecture
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data Warehouse
 
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
 
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentTemporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
EBS on Oracle Cloud
EBS on Oracle CloudEBS on Oracle Cloud
EBS on Oracle Cloud
 
Introducing the eDB360 Tool
Introducing the eDB360 ToolIntroducing the eDB360 Tool
Introducing the eDB360 Tool
 

Mais de Ververica

Aljoscha Krettek - Apache Flink for IoT: How Event-Time Processing Enables Ea...
Aljoscha Krettek - Apache Flink for IoT: How Event-Time Processing Enables Ea...Aljoscha Krettek - Apache Flink for IoT: How Event-Time Processing Enables Ea...
Aljoscha Krettek - Apache Flink for IoT: How Event-Time Processing Enables Ea...
Ververica
 

Mais de Ververica (20)

2020-05-06 Apache Flink Meetup London: The Easiest Way to Get Operational wit...
2020-05-06 Apache Flink Meetup London: The Easiest Way to Get Operational wit...2020-05-06 Apache Flink Meetup London: The Easiest Way to Get Operational wit...
2020-05-06 Apache Flink Meetup London: The Easiest Way to Get Operational wit...
 
Webinar: How to contribute to Apache Flink - Robert Metzger
Webinar:  How to contribute to Apache Flink - Robert MetzgerWebinar:  How to contribute to Apache Flink - Robert Metzger
Webinar: How to contribute to Apache Flink - Robert Metzger
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
 
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin KnaufWebinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
 
Webinar: Detecting row patterns with Flink SQL - Dawid Wysakowicz
Webinar:  Detecting row patterns with Flink SQL - Dawid WysakowiczWebinar:  Detecting row patterns with Flink SQL - Dawid Wysakowicz
Webinar: Detecting row patterns with Flink SQL - Dawid Wysakowicz
 
Deploying Flink on Kubernetes - David Anderson
 Deploying Flink on Kubernetes - David Anderson Deploying Flink on Kubernetes - David Anderson
Deploying Flink on Kubernetes - David Anderson
 
Webinar: Flink SQL in Action - Fabian Hueske
 Webinar: Flink SQL in Action - Fabian Hueske Webinar: Flink SQL in Action - Fabian Hueske
Webinar: Flink SQL in Action - Fabian Hueske
 
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...
 
2018-01 Seattle Apache Flink Meetup at OfferUp, Opening Remarks and Talk 2
2018-01 Seattle Apache Flink Meetup at OfferUp, Opening Remarks and Talk 22018-01 Seattle Apache Flink Meetup at OfferUp, Opening Remarks and Talk 2
2018-01 Seattle Apache Flink Meetup at OfferUp, Opening Remarks and Talk 2
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache FlinkTzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
 
Kostas Kloudas - Complex Event Processing with Flink: the state of FlinkCEP
Kostas Kloudas - Complex Event Processing with Flink: the state of FlinkCEP Kostas Kloudas - Complex Event Processing with Flink: the state of FlinkCEP
Kostas Kloudas - Complex Event Processing with Flink: the state of FlinkCEP
 
Aljoscha Krettek - Portable stateful big data processing in Apache Beam
Aljoscha Krettek - Portable stateful big data processing in Apache BeamAljoscha Krettek - Portable stateful big data processing in Apache Beam
Aljoscha Krettek - Portable stateful big data processing in Apache Beam
 
Aljoscha Krettek - Apache Flink® and IoT: How Stateful Event-Time Processing ...
Aljoscha Krettek - Apache Flink® and IoT: How Stateful Event-Time Processing ...Aljoscha Krettek - Apache Flink® and IoT: How Stateful Event-Time Processing ...
Aljoscha Krettek - Apache Flink® and IoT: How Stateful Event-Time Processing ...
 
Apache Flink Meetup: Sanjar Akhmedov - Joining Infinity – Windowless Stream ...
Apache Flink Meetup:  Sanjar Akhmedov - Joining Infinity – Windowless Stream ...Apache Flink Meetup:  Sanjar Akhmedov - Joining Infinity – Windowless Stream ...
Apache Flink Meetup: Sanjar Akhmedov - Joining Infinity – Windowless Stream ...
 
Kostas Kloudas - Extending Flink's Streaming APIs
Kostas Kloudas - Extending Flink's Streaming APIsKostas Kloudas - Extending Flink's Streaming APIs
Kostas Kloudas - Extending Flink's Streaming APIs
 
Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup
Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup
Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup
 
Robert Metzger - Apache Flink Community Updates November 2016 @ Berlin Meetup
Robert Metzger - Apache Flink Community Updates November 2016 @ Berlin Meetup Robert Metzger - Apache Flink Community Updates November 2016 @ Berlin Meetup
Robert Metzger - Apache Flink Community Updates November 2016 @ Berlin Meetup
 
Aljoscha Krettek - Apache Flink for IoT: How Event-Time Processing Enables Ea...
Aljoscha Krettek - Apache Flink for IoT: How Event-Time Processing Enables Ea...Aljoscha Krettek - Apache Flink for IoT: How Event-Time Processing Enables Ea...
Aljoscha Krettek - Apache Flink for IoT: How Event-Time Processing Enables Ea...
 
Kostas Tzoumas - Stream Processing with Apache Flink®
Kostas Tzoumas - Stream Processing with Apache Flink®Kostas Tzoumas - Stream Processing with Apache Flink®
Kostas Tzoumas - Stream Processing with Apache Flink®
 

Último

Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
gajnagarg
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
gajnagarg
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
chadhar227
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
HyderabadDolls
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
ahmedjiabur940
 
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
HyderabadDolls
 

Último (20)

SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptx
 
Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
 

Fabian Hueske - Stream Analytics with SQL on Apache Flink

  • 1. FEBRUARY9, 2017, WARSAW Stream Analytics with SQL on Apache Flink® Fabian Hueske | Apache Flink PMC member | Co-founder dataArtisans
  • 3. FEBRUARY9, 2017, WARSAW Data Analytics on Streaming Data • Periodic batch processing • Lots of duct tape and baling wire • It’s up to you to make everything work… reliably! • High latency • Continuous stream processing • Framework takes care of failures • Low latency
  • 4. FEBRUARY9, 2017, WARSAW Stream Processing in Apache Flink • Platform for scalable stream processing • Fast • Low latency and high throughput • Accurate • Stateful streaming processing in event time • Reliable • Exactly-once state guarantees • Highly available cluster setup
  • 5. FEBRUARY9, 2017, WARSAW Streaming Applications Powered by Flink 30 Flink applications in production for more than one year. 10 billion events (2TB) processed daily Complex jobs of > 30 operators running 24/7, processing 30 billion events daily, maintaining state of 100s of GB with exactly-once guarantees Largest job has > 20 operators, runs on > 5000 vCores in 1000-node cluster, processes millions of events per second
  • 6. FEBRUARY9, 2017, WARSAW Stream Processing is not for Everybody, … yet • APIs of open source stream processors target developers • Implementing streaming applications requires knowledge & skill • Stream processing concepts (time, state, windows, triggers, ...) • Programming experience (Java / Scala APIs) • Stream processing technology spreads rapidly • There is a talent gap
  • 7. FEBRUARY9, 2017, WARSAW What about SQL? • SQL is the most widely used language for data analytics • Many good reasons to use SQL • Declarative specification • Optimization • Efficient execution • “Everybody” knows SQL • SQL would make stream processing much more accessible, but…
  • 8. FEBRUARY9, 2017, WARSAW No OS Stream Processor Offers Decent SQL Support • SQL was not designed with streaming data in mind • Relations are sets. Streams are infinite sequences. • Records arrive over time. • Syntax • Time-based operations are cumbersome to specify (aggregates, joins) • Semantics • A SQL query should compute the same result on a batch table and a stream
  • 9. FEBRUARY9, 2017, WARSAW • Standard SQL and LINQ-style Table API • Unified APIs for batch & streaming data • Common translation layers • Optimization based on Apache Calcite • Type system & code-generation • Table sources & sinks • Streaming SQL & Table API is work in progress Flink’s SQL Support & Table API
  • 10. FEBRUARY9, 2017, WARSAW What are the Use Cases for Stream SQL? • Continuous ETL & Data Import • Live Dashboards & Reports • Ad-hoc Analytics & Exploration
  • 11. FEBRUARY9, 2017, WARSAW Dynamic Tables • Core concept is a “Dynamic Table” • Dynamic tables change over time • Dynamic tables are treated like static batch tables • Dynamic tables are queried with standard SQL • A query returns another dynamic table • Stream ←→ Dynamic Table conversions without information loss • “Stream / Table Duality”
  • 12. FEBRUARY9, 2017, WARSAW Stream → Dynamic Table • Append • Replace by Key time k 1 A 2 B 4 A 5 C 7 B 8 A 9 B … … time k 2, B4, A5, C7, B8, A9, B 1, A 2, B4, A5, C7, B8, A9, B 1, A 8 A 9 B 5 C … …
  • 13. FEBRUARY9, 2017, WARSAW Querying a Dynamic Table • Dynamic tables change over time • A[t]: Table A at time t • Dynamic tables are queried with regular SQL • Result of a query changes as input table changes • q(A[t]): Evaluate query q on table A at time t • As time t progresses, the query result is continuously updated • similar to maintaining a materialized view • t is current event time
  • 14. FEBRUARY9, 2017, WARSAW Querying a Dynamic Table time k k cnt A 3 B 2 C 1 9 B k cnt A 3 B 3 C 1 12 C k cnt A 3 B 3 C 2 A[8] A[9] A[12] q(A[8]) q(A[9]) q(A[12]) Table A q: SELECT k, COUNT(k) as cnt FROM A GROUP BY k 1 A 2 B 4 A 5 C 7 B 8 A
  • 15. FEBRUARY9, 2017, WARSAW time k A[5] A[10] A[15] q(A[5]) q(A[10]) q(A[15]) Table A Querying a Dynamic Table 7 B 8 A 9 B 11 A 12 C 14 C 15 A k cnt endT A 2 5 B 1 5 C 1 5 q(A) A 1 10 B 2 10 A 2 15 C 2 15 q: SELECT k, COUNT(k) AS cnt, TUMBLE_END( time, INTERVAL '5' SECONDS) AS endT FROM A GROUP BY k, TUMBLE( time, INTERVAL '5' SECONDS) 1 A 2 B 4 A 5 C
  • 16. FEBRUARY9, 2017, WARSAW Can We Run Any Query on Dynamic Tables? • No  • There are state and computation constraints • State may not grow infinitely as more data arrives • Clean-up timeout must be defined • Input updates may only trigger partial re-computation of the result • Queries with possibly unbounded state or computation are rejected • Optimizer performs validation
  • 17. FEBRUARY9, 2017, WARSAW Bounding the State of a Query • State grows infinitely with domain of grouping attribute • Bound query input by time • Query aggregates data of last 24 hours. Older data is discarded. SELECT k, COUNT(k) AS cnt FROM A GROUP BY k SELECT k, COUNT(k) AS cnt FROM A WHERE last(time, INTERVAL ‘1’ DAY) GROUP BY k STOP! UNBOUNED STATE!
  • 18. FEBRUARY9, 2017, WARSAW Updating Results and Late Arriving Data • Sometimes emitted results need to be updated • Results which are continuously updated • Results for which relevant records arrived late • Results that might be updated must be kept as state • Clean-up timeout • When a table is converted into a stream, updates must be propagated • Update mode • Add/Retract mode
  • 19. FEBRUARY9, 2017, WARSAW Dynamic Table → Stream: Update Mode time k Table A B, 1A, 2C, 1B, 2A, 3 A, 1 SELECT k, COUNT(k) AS cnt FROM A GROUP BY k 1 A 2 B 4 A 5 C 7 B 8 A … … Update by Key
  • 20. FEBRUARY9, 2017, WARSAW Dynamic Table → Stream: Add/Retract Mode time k Table A + B, 1+ A, 2+ C, 1+ B, 2+ A, 3 + A, 1- A, 1- B, 1- A, 2 1 A 2 B 4 A 5 C 7 B 8 A … … SELECT k, COUNT(k) AS cnt FROM A GROUP BY k Add (+) / Retract (-)
  • 21. FEBRUARY9, 2017, WARSAW Current State of SQL and Table API • Huge interest and many contributors • Current development efforts • Adding more window operators • Introducing dynamic tables • And there is a lot more to do • New operators and features for streaming and batch • Performance improvements • Tooling and integration • Try it out, give feedback, and start contributing!
  • 22. FEBRUARY9, 2017, WARSAW Stream Analytics with SQL on Apache Flink Fabian Hueske | @fhueske