Jonas Traub Philipp M. Grulich Alejandro Rodríguez Cuéllar Sebastian Breß
Asterios Katsifodimos Tilmann Rabl Volker Markl
Efficient Window Aggregation with
General Stream Slicing
22nd International Conference on Extending Database Technology
March 26-29, 2019, Lisbon, Portugal
Stream Processing Pipelines
27.03.2019 Efficient Window Aggregation with General Stream Slicing 2
A stream processing pipeline is a series of concurrently running operators.
Window
Aggregation
Motivation
Stream Slicing Example
The number of slices depends on the workload.
We store partial aggregates instead of all tuples: small memory footprint.
We assign each tuple to exactly one slice: O(1) per-tuple complexity.
We require just a few computation steps to calculate final aggregates: low latency.
We share partial aggregates among all users and queries: efficiency by preventing redundancy.
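The four properties above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SlicedSum`, the fixed slice length, and the sample tuples are assumptions made for illustration. In the talk, the number of slices depends on the workload, whereas this sketch simply cuts a slice every `slice_length` time units and assumes window edges are aligned with slice edges.

```python
class SlicedSum:
    """Keeps one partial sum per slice instead of storing every tuple."""

    def __init__(self, slice_length):
        self.slice_length = slice_length  # hypothetical fixed slice size
        self.partials = {}                # slice start time -> partial sum

    def add(self, timestamp, value):
        # Each tuple is assigned to exactly one slice (constant work per tuple).
        start = (timestamp // self.slice_length) * self.slice_length
        self.partials[start] = self.partials.get(start, 0) + value

    def window_sum(self, window_start, window_end):
        # A final aggregate combines partial aggregates, not individual tuples.
        # Assumes window edges coincide with slice edges.
        return sum(
            partial
            for start, partial in self.partials.items()
            if window_start <= start < window_end
        )


slicer = SlicedSum(slice_length=5)
for ts, val in [(1, 4), (3, 2), (6, 7), (12, 1), (14, 5), (17, 3)]:
    slicer.add(ts, val)

# Two queries with different windows share the same slices.
tumbling_10 = slicer.window_sum(0, 10)  # slices [0,5) and [5,10)
sliding_15 = slicer.window_sum(5, 20)   # slices [5,10), [10,15), [15,20)
print(tumbling_10, sliding_15)
```

Six tuples are reduced to four partial sums here, and both queries read the same slices rather than aggregating the tuples twice.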
General Stream Slicing
Workload Characteristics
• Window Types: context free, forward context free, forward context aware
• Stream Order: in-order, out-of-order
• Window Measures: time, tuple count, arbitrary
• Aggregation Functions: distributive, algebraic, holistic (properties: associativity, commutativity, invertibility)
General Stream Slicing combines generality and efficiency in a single solution.
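To make the aggregation-function classes concrete, here is a small Python sketch using their commonly used definitions: a distributive function combines partial results directly, an algebraic function needs a fixed-size partial, and a holistic function needs all values. The function names (`lift_avg`, `combine_avg`, `lower_avg`, and so on) and the sample data are chosen for this illustration only.

```python
from functools import reduce
from statistics import median

# Distributive: partial sums combine directly into the final sum.
def combine_sum(a, b):
    return a + b

# Invertible: a value can be taken out of a partial sum again.
def invert_sum(total, removed):
    return total - removed

# Algebraic: the average needs a fixed-size partial, here (sum, count).
def lift_avg(value):
    return (value, 1)

def combine_avg(a, b):
    return (a[0] + b[0], a[1] + b[1])

def lower_avg(partial):
    return partial[0] / partial[1]

# Holistic: the median has no fixed-size partial, so all values are kept.
def combine_median(a, b):
    return a + b  # list concatenation

slice_a = [1, 2, 1, 4, 3]
slice_b = [1, 5, 2, 2, 3]

total = combine_sum(sum(slice_a), sum(slice_b))
avg_a = reduce(combine_avg, (lift_avg(v) for v in slice_a))
avg_b = reduce(combine_avg, (lift_avg(v) for v in slice_b))
avg = lower_avg(combine_avg(avg_a, avg_b))
med = median(combine_median(slice_a, slice_b))
print(total, avg, med)
```

Sum and average keep a constant amount of state per slice, while the median forces the slice to retain its values, which is one reason the aggregation function is a workload characteristic.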
Window Aggregation Concepts
Variations of Stream Slicing
Non-Slicing Techniques
General Slicing Core
The General Slicing Core adapts to workload characteristics
and provides extension points for user-defined window types and aggregation functions.
General Stream Slicing Internals
Part 1: Three Fundamental Operations on Slices
• Merge Slices
• Split Slices
• Update Slices
Part 2: Adapt to Workload Characteristics:
• Do we need to store original tuples?
• Do we potentially need to split slices?
• Do we potentially need to remove tuples from slices?
General Stream Slicing adapts to current workload characteristics.
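The three slice operations can be sketched in Python as follows. This is an illustrative sketch, not the Scotty code: the `Slice` class, its fields, and the use of sum as the aggregate are assumptions. The sketch always stores the original tuples so that a split can recompute both halves; whether that storage is actually needed is exactly what the Part 2 questions decide for a given workload.

```python
from dataclasses import dataclass, field


@dataclass
class Slice:
    start: int   # inclusive start timestamp
    end: int     # exclusive end timestamp
    agg: int = 0  # partial aggregate (a sum in this sketch)
    # Original (timestamp, value) pairs, kept here so that split() works.
    tuples: list = field(default_factory=list)


def update(s, timestamp, value):
    """Update: add one tuple to a slice and to its partial aggregate."""
    s.agg += value
    s.tuples.append((timestamp, value))


def merge(a, b):
    """Merge: combine two adjacent slices (a must directly precede b)."""
    return Slice(a.start, b.end, a.agg + b.agg, a.tuples + b.tuples)


def split(s, at):
    """Split: cut a slice at timestamp `at`, recomputing from stored tuples."""
    left, right = Slice(s.start, at), Slice(at, s.end)
    for ts, value in s.tuples:
        update(left if ts < at else right, ts, value)
    return left, right


s = Slice(0, 10)
for ts, value in [(1, 4), (3, 2), (6, 7)]:
    update(s, ts, value)

left, right = split(s, 5)
merged = merge(left, right)
print(left.agg, right.agg, merged.agg)
```

If a workload never requires splits or removals, the `tuples` list could be dropped and each slice would hold only its partial aggregate.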
Impact of Workload Characteristics (Example)
Count-based tumbling window with a length of 5 tuples (window boundaries marked with |):

Value:        1  2  1  4  3 |  1  5  2  2  3 |  6  1  2  2  1
Tuple Count:  1  2  3  4  5 |  6  7  8  9 10 | 11 12 13 14 15
Event Time:   5 12 13 20 35 | 37 42 46 48 51 | 52 57 63 64 65

Window aggregates: 11, 13, 12

What if the stream is out-of-order?

Out-of-order tuple: value 5, event time 49

Updated window aggregates: 11, 13 + 5 - 3, 12 + 3 - 1
What if the aggregation function is not invertible?
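The example can be replayed in a short Python sketch. The values and event times come from the slide; the aggregate is taken to be a sum, which is an assumption that matches the slide's numbers (1+2+1+4+3 = 11, and so on). The helper names `window_sums` and `insert_out_of_order` are hypothetical.

```python
import bisect

WINDOW = 5  # count-based tumbling window length

values = [1, 2, 1, 4, 3, 1, 5, 2, 2, 3, 6, 1, 2, 2, 1]
times = [5, 12, 13, 20, 35, 37, 42, 46, 48, 51, 52, 57, 63, 64, 65]


def window_sums(vals):
    """Sums of all complete count-based tumbling windows."""
    last_start = len(vals) - WINDOW
    return [sum(vals[i:i + WINDOW]) for i in range(0, last_start + 1, WINDOW)]


def insert_out_of_order(sums, vals, pos, new_value):
    """Shift one tuple through every affected window.

    Each window gains the tuple that shifts in and loses its last tuple,
    which shifts into the next window. Relies on sum being invertible.
    """
    updated = list(sums)
    incoming = new_value
    for w in range(pos // WINDOW, len(sums)):
        outgoing = vals[(w + 1) * WINDOW - 1]
        updated[w] = updated[w] + incoming - outgoing
        incoming = outgoing
    return updated


before = window_sums(values)

# The out-of-order tuple (value 5, event time 49) belongs after time 48.
pos = bisect.bisect(times, 49)
after = insert_out_of_order(before, values, pos, 5)

# Cross-check against recomputing every window from scratch.
recomputed = window_sums(values[:pos] + [5] + values[pos:])
print(before, after, recomputed)
```

The incremental update touches each affected window once, but it depends on subtraction. For a non-invertible function such as max, the outgoing tuple cannot simply be subtracted, so the affected aggregates would have to be recomputed from stored tuples.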
In-order Processing with Context Free Windows
Slicing techniques scale to large numbers of concurrent windows.
Impact of Stream Order
Slicing techniques are robust against out-of-order tuples.
Impact of Aggregation Functions (20% out-of-order)
Stream Slicing performs well on many different kinds of aggregation functions.
Efficient Window Aggregation with General Stream Slicing
• We identify workload characteristics which impact
applicability and performance of window aggregation techniques.
• We present a generally applicable and highly efficient solution for
streaming window aggregation.
• We show that general stream slicing is generally applicable and
offers better performance than alternative approaches.
Open Source Repository: tu-berlin-dima.github.io/scotty-window-processor

Mais conteúdo relacionado

Semelhante a Efficient Window Aggregation with General Stream Slicing

CNC Programming
CNC ProgrammingCNC Programming
CNC ProgrammingMal Moran
 
Trouble shooting Storage Area Networks for virtualisation deployments
Trouble shooting Storage Area Networks for virtualisation deploymentsTrouble shooting Storage Area Networks for virtualisation deployments
Trouble shooting Storage Area Networks for virtualisation deploymentsKevin Walker
 
Covid Hazardous Waste Management System
Covid Hazardous Waste Management SystemCovid Hazardous Waste Management System
Covid Hazardous Waste Management SystemIRJET Journal
 
Warehouses Energy Consumption using Solar Energy with the help of Blockchain
Warehouses Energy Consumption using Solar Energy with the help of BlockchainWarehouses Energy Consumption using Solar Energy with the help of Blockchain
Warehouses Energy Consumption using Solar Energy with the help of BlockchainIRJET Journal
 
Big data ET models & benchmarking with distributed OSGEO tools
Big data ET models & benchmarking with distributed OSGEO toolsBig data ET models & benchmarking with distributed OSGEO tools
Big data ET models & benchmarking with distributed OSGEO toolsHirofumi Hayashi
 
Stream processing comparison
Stream processing comparisonStream processing comparison
Stream processing comparisonYangjun Wang
 
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...Jonas Traub
 
Simulations Part III.pdf
Simulations Part III.pdfSimulations Part III.pdf
Simulations Part III.pdfJeanMarshall8
 
Cloud-based Integrated Process Planning and Scheduling Optimisation via Asyn...
 Cloud-based Integrated Process Planning and Scheduling Optimisation via Asyn... Cloud-based Integrated Process Planning and Scheduling Optimisation via Asyn...
Cloud-based Integrated Process Planning and Scheduling Optimisation via Asyn...Piotr Dziurzanski
 
Simulations Part III.pdf
Simulations Part III.pdfSimulations Part III.pdf
Simulations Part III.pdfJeanMarshall8
 
IRJET- Effect of Floating Column on Structral Frames During Seismic Forces
IRJET- Effect of Floating Column on Structral Frames During Seismic ForcesIRJET- Effect of Floating Column on Structral Frames During Seismic Forces
IRJET- Effect of Floating Column on Structral Frames During Seismic ForcesIRJET Journal
 
IRJET- A Review on Design and Fabrication of a Solar Roadways
IRJET- A Review on Design and Fabrication of a Solar RoadwaysIRJET- A Review on Design and Fabrication of a Solar Roadways
IRJET- A Review on Design and Fabrication of a Solar RoadwaysIRJET Journal
 
LecturePPT_Unit_3b_AY2021-22_TechngVrsn.ppt
LecturePPT_Unit_3b_AY2021-22_TechngVrsn.pptLecturePPT_Unit_3b_AY2021-22_TechngVrsn.ppt
LecturePPT_Unit_3b_AY2021-22_TechngVrsn.pptPhoenixEagles
 
Comments on Simulations Project Parts I & II Marking Contingencies.pdf
Comments on Simulations Project Parts I & II Marking Contingencies.pdfComments on Simulations Project Parts I & II Marking Contingencies.pdf
Comments on Simulations Project Parts I & II Marking Contingencies.pdfBrij Consulting, LLC
 
Agile DDD Genuin Objects
Agile DDD Genuin ObjectsAgile DDD Genuin Objects
Agile DDD Genuin ObjectsJukka Tamminen
 
Frontend performance on the web
Frontend performance on the webFrontend performance on the web
Frontend performance on the webTRITUM
 

Semelhante a Efficient Window Aggregation with General Stream Slicing (20)

CNC Programming
CNC ProgrammingCNC Programming
CNC Programming
 
Trouble shooting Storage Area Networks for virtualisation deployments
Trouble shooting Storage Area Networks for virtualisation deploymentsTrouble shooting Storage Area Networks for virtualisation deployments
Trouble shooting Storage Area Networks for virtualisation deployments
 
Covid Hazardous Waste Management System
Covid Hazardous Waste Management SystemCovid Hazardous Waste Management System
Covid Hazardous Waste Management System
 
Warehouses Energy Consumption using Solar Energy with the help of Blockchain
Warehouses Energy Consumption using Solar Energy with the help of BlockchainWarehouses Energy Consumption using Solar Energy with the help of Blockchain
Warehouses Energy Consumption using Solar Energy with the help of Blockchain
 
Big data ET models & benchmarking with distributed OSGEO tools
Big data ET models & benchmarking with distributed OSGEO toolsBig data ET models & benchmarking with distributed OSGEO tools
Big data ET models & benchmarking with distributed OSGEO tools
 
Stream processing comparison
Stream processing comparisonStream processing comparison
Stream processing comparison
 
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
 
Simulations Part III.pdf
Simulations Part III.pdfSimulations Part III.pdf
Simulations Part III.pdf
 
Cloud-based Integrated Process Planning and Scheduling Optimisation via Asyn...
 Cloud-based Integrated Process Planning and Scheduling Optimisation via Asyn... Cloud-based Integrated Process Planning and Scheduling Optimisation via Asyn...
Cloud-based Integrated Process Planning and Scheduling Optimisation via Asyn...
 
Simulations Part III.pdf
Simulations Part III.pdfSimulations Part III.pdf
Simulations Part III.pdf
 
Presentation1
Presentation1Presentation1
Presentation1
 
IRJET- Effect of Floating Column on Structral Frames During Seismic Forces
IRJET- Effect of Floating Column on Structral Frames During Seismic ForcesIRJET- Effect of Floating Column on Structral Frames During Seismic Forces
IRJET- Effect of Floating Column on Structral Frames During Seismic Forces
 
jit.pptx
jit.pptxjit.pptx
jit.pptx
 
IRJET- A Review on Design and Fabrication of a Solar Roadways
IRJET- A Review on Design and Fabrication of a Solar RoadwaysIRJET- A Review on Design and Fabrication of a Solar Roadways
IRJET- A Review on Design and Fabrication of a Solar Roadways
 
Little gems in TYPO3 v9
Little gems in TYPO3 v9Little gems in TYPO3 v9
Little gems in TYPO3 v9
 
LecturePPT_Unit_3b_AY2021-22_TechngVrsn.ppt
LecturePPT_Unit_3b_AY2021-22_TechngVrsn.pptLecturePPT_Unit_3b_AY2021-22_TechngVrsn.ppt
LecturePPT_Unit_3b_AY2021-22_TechngVrsn.ppt
 
Comments on Simulations Project Parts I & II Marking Contingencies.pdf
Comments on Simulations Project Parts I & II Marking Contingencies.pdfComments on Simulations Project Parts I & II Marking Contingencies.pdf
Comments on Simulations Project Parts I & II Marking Contingencies.pdf
 
Agile DDD Genuin Objects
Agile DDD Genuin ObjectsAgile DDD Genuin Objects
Agile DDD Genuin Objects
 
Frontend performance on the web
Frontend performance on the webFrontend performance on the web
Frontend performance on the web
 
26 corbellini random forest for mismatch
26 corbellini random forest for mismatch26 corbellini random forest for mismatch
26 corbellini random forest for mismatch
 

Mais de Jonas Traub

Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...Jonas Traub
 
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Jonas Traub
 
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...Jonas Traub
 
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...Jonas Traub
 
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019Jonas Traub
 
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...Jonas Traub
 
Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingFlink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingJonas Traub
 
Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing
Scotty: Efficient Window Aggregation for Out-of-Order Stream ProcessingScotty: Efficient Window Aggregation for Out-of-Order Stream Processing
Scotty: Efficient Window Aggregation for Out-of-Order Stream ProcessingJonas Traub
 
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Jonas Traub
 
Efficient SIMD Vectorization for Hashing in OpenCL
Efficient SIMD Vectorization for Hashing in OpenCLEfficient SIMD Vectorization for Hashing in OpenCL
Efficient SIMD Vectorization for Hashing in OpenCLJonas Traub
 
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...Jonas Traub
 
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...Jonas Traub
 
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...Jonas Traub
 
I²: Interactive Real-Time Visualization for Streaming Data
I²: Interactive Real-Time Visualization for Streaming DataI²: Interactive Real-Time Visualization for Streaming Data
I²: Interactive Real-Time Visualization for Streaming DataJonas Traub
 
LWA 2015: The Apache Flink Platform (Poster)
LWA 2015: The Apache Flink Platform (Poster)LWA 2015: The Apache Flink Platform (Poster)
LWA 2015: The Apache Flink Platform (Poster)Jonas Traub
 
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream AnalysisLWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream AnalysisJonas Traub
 

Mais de Jonas Traub (16)

Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
 
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
 
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
 
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...
Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentat...
 
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
 
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
 
Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingFlink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
 
Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing
Scotty: Efficient Window Aggregation for Out-of-Order Stream ProcessingScotty: Efficient Window Aggregation for Out-of-Order Stream Processing
Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing
 
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Efficient Window Aggregation with General Stream Slicing

  • 1. Jonas Traub, Philipp M. Grulich, Alejandro Rodríguez Cuéllar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl. Efficient Window Aggregation with General Stream Slicing. 22nd International Conference on Extending Database Technology, March 26-29, 2019, Lisbon, Portugal.
  • 2. Stream Processing Pipelines 27.03.2019 Efficient Window Aggregation with General Stream Slicing 2 A stream processing pipeline is a series of concurrently running operators.
  • 3-5. Stream Processing Pipelines. A stream processing pipeline is a series of concurrently running operators. Figure label: Window Aggregation.
  • 6-7. Motivation (figure slides, no text).
  • 8-10. Stream Slicing Example. The number of slices depends on the workload.
  • 11-15. Stream Slicing Example. We store partial aggregates instead of all tuples. → Small memory footprint.
  • 16-17. Stream Slicing Example. We assign each tuple to exactly one slice. → O(1) per-tuple complexity.
  • 18-19. Stream Slicing Example. We require just a few computation steps to calculate final aggregates. → Low latency.
  • 20-21. Stream Slicing Example. We share partial aggregations among all users and queries. → Efficiency by preventing redundancy.
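The properties listed in the Stream Slicing Example slides can be illustrated with a minimal Python sketch. It is not the implementation from the talk: it assumes in-order tuples, a sum aggregate, and time-based windows whose boundaries fall on multiples of a fixed slice length. The names `SLICE_LEN`, `add_tuple`, and `window_sum`, as well as the sample tuples, are hypothetical.

```python
from collections import OrderedDict

SLICE_LEN = 5  # assumed slice length in time units (hypothetical)

# One partial aggregate (a running sum) per slice, keyed by slice start time.
# Only these partial aggregates are stored, not the tuples themselves.
slices = OrderedDict()

def add_tuple(ts, value):
    """Assign the tuple to exactly one slice: O(1) work per tuple."""
    start = (ts // SLICE_LEN) * SLICE_LEN
    slices[start] = slices.get(start, 0) + value

def window_sum(start, end):
    """Combine the partial aggregates of the slices covering [start, end)."""
    return sum(agg for s, agg in slices.items() if start <= s < end)

# Illustrative (event time, value) pairs, already in order.
for ts, value in [(1, 4), (3, 1), (6, 2), (8, 7), (12, 5), (14, 3)]:
    add_tuple(ts, value)

# Three windows of different position and length share the same slices.
print(window_sum(0, 10))
print(window_sum(5, 15))
print(window_sum(0, 15))
```

Each window result needs one combine step per covered slice rather than one per tuple, which is where the low latency and the sharing among queries come from in this simplified setting.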
  • 22-28. General Stream Slicing. Workload Characteristics: Window Types (Context Free, Forward Context Free, Forward Context Aware); Stream Order (in-order, out-of-order); Window Measures (time, tuple count, arbitrary); Aggregation Functions (distributive, algebraic, holistic; associativity, commutativity, invertibility). General Stream Slicing combines generality and efficiency in a single solution.
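The aggregation function classes named on this slide can be made concrete with a short Python sketch. It uses the common lift/combine/lower decomposition of partial aggregates; all function names are hypothetical, and the two sample slices are chosen only for illustration.

```python
import statistics
from functools import reduce

# Distributive: the partial aggregate has the same form as the final result.
def sum_combine(a, b):
    return a + b

# Sum is also invertible: a value can be removed from a partial aggregate.
# (max is distributive too, but not invertible.)
def sum_invert(agg, removed):
    return agg - removed

# Algebraic: a fixed-size partial aggregate, here (sum, count) for the average.
def avg_lift(value):
    return (value, 1)

def avg_combine(a, b):
    return (a[0] + b[0], a[1] + b[1])

def avg_lower(partial):
    return partial[0] / partial[1]

# Holistic: no fixed-size partial aggregate exists, so the values are kept.
def median_combine(a, b):
    return a + b  # list concatenation

def median_lower(values):
    return statistics.median(values)

slice_a, slice_b = [1, 2, 1, 4, 3], [1, 5, 2, 2, 3]

total = sum_combine(sum(slice_a), sum(slice_b))
without_last = sum_invert(total, slice_b[-1])

avg_a = reduce(avg_combine, map(avg_lift, slice_a))
avg_b = reduce(avg_combine, map(avg_lift, slice_b))
average = avg_lower(avg_combine(avg_a, avg_b))

median = median_lower(median_combine(slice_a, slice_b))
```

Associativity and commutativity matter because they determine whether partial aggregates may be combined in any grouping and order; invertibility determines whether a tuple can be removed without recomputing a slice.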
  • 29. Window Aggregation Concepts. Variations of Stream Slicing; Non-Slicing Techniques.
  • 30-31. General Slicing Core. The General Slicing Core adapts to workload characteristics and provides extension points for user-defined window types and aggregation functions.
  • 32-42. General Stream Slicing Internals. Part 1: Three Fundamental Operations on Slices: Merge Slices, Split Slices, Update Slices. Part 2: Adapt to Workload Characteristics: Do we need to store original tuples? Do we potentially need to split slices? Do we potentially need to remove tuples from slices? General Stream Slicing adapts to current workload characteristics.
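The three slice operations can be sketched in Python as follows. This is a simplified illustration under stated assumptions (a sum aggregate, time-based slice boundaries, tuples always stored), not the code of the open source library; the `Slice` class and the `merge` and `split` helpers are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Slice:
    start: int    # inclusive start timestamp
    end: int      # exclusive end timestamp
    agg: int = 0  # partial aggregate (a sum in this sketch)
    # Original (timestamp, value) pairs. A real system would keep them only
    # when the workload may require splits or tuple removal.
    tuples: list = field(default_factory=list)

    def update(self, ts, value):
        """Update: add one tuple to this slice."""
        self.agg += value
        self.tuples.append((ts, value))

def merge(a, b):
    """Merge two adjacent slices by combining their partial aggregates."""
    assert a.end == b.start, "slices must be adjacent"
    return Slice(a.start, b.end, a.agg + b.agg, a.tuples + b.tuples)

def split(s, ts):
    """Split a slice at ts; both halves are recomputed from stored tuples."""
    left, right = Slice(s.start, ts), Slice(ts, s.end)
    for t, v in s.tuples:
        (left if t < ts else right).update(t, v)
    return left, right

s = Slice(0, 10)
for ts, value in [(1, 4), (3, 1), (6, 2), (8, 7)]:
    s.update(ts, value)

left, right = split(s, 5)
rejoined = merge(left, right)
```

The sketch shows why the three questions on the slide matter: `merge` works on partial aggregates alone, whereas `split` is only possible here because the original tuples were kept.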
  • 43-47. Impact of Workload Characteristics (Example). Count-based tumbling window with a length of 5 tuples. Tuple values (tuple count 1 to 15): 1 2 1 4 3 | 1 5 2 2 3 | 6 1 2 2 1. Window aggregates: 11, 13, 12 (the sums of the three windows).
  • 48-57. Impact of Workload Characteristics (Example). What if the stream is out-of-order? Event times of the 15 tuples: 5 12 13 20 35 37 42 46 48 51 52 57 63 64 65. Out-of-order tuple: value 5 with event time 49. The slide annotations on the second and third aggregates appear to read +5 -3 and +3 -1.
  • 58. Impact of Workload Characteristics (Example). What if the aggregation function is not invertible?
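The out-of-order example can be replayed with a short Python sketch. The values and event times are taken from the slide; the reading that the late tuple pushes the last tuple of each affected window into the next window is an interpretation of the slide, and the names `WINDOW`, `window_sums`, and `carry` are hypothetical.

```python
WINDOW = 5  # count-based tumbling window length from the slide

values = [1, 2, 1, 4, 3, 1, 5, 2, 2, 3, 6, 1, 2, 2, 1]
times = [5, 12, 13, 20, 35, 37, 42, 46, 48, 51, 52, 57, 63, 64, 65]

def window_sums(vals):
    """Sum of every complete count-based tumbling window."""
    full = len(vals) - len(vals) % WINDOW
    return [sum(vals[i:i + WINDOW]) for i in range(0, full, WINDOW)]

before = window_sums(values)

# Out-of-order tuple from the slide: value 5 with event time 49.
late_time, late_value = 49, 5
pos = sum(1 for t in times if t <= late_time)  # position in event-time order

# Incremental update: each affected window gains one tuple (the late tuple,
# or the tuple pushed out of the previous window) and loses its last tuple.
updated = list(before)
carry = late_value
for w in range(pos // WINDOW, len(updated)):
    shifted_out = values[(w + 1) * WINDOW - 1]
    updated[w] += carry - shifted_out  # relies on sum being invertible
    carry = shifted_out

# Cross-check against a full recomputation over the reordered stream.
recomputed = window_sums(values[:pos] + [late_value] + values[pos:])

print(before, updated, recomputed)
```

For a non-invertible function such as max, the subtraction step is unavailable, so the affected windows would have to be recomputed from stored tuples. This is the case the final question on the slide points to.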
  • 59-60. In-order Processing with Context Free Windows. Slicing techniques scale to large numbers of concurrent windows.
  • 61-62. Impact of Stream Order. Slicing techniques are robust against out-of-order tuples.
  • 63-64. Impact of Aggregation Functions (20% out-of-order). Stream Slicing performs well on many different kinds of aggregation functions.
  • 65-69. Efficient Window Aggregation with General Stream Slicing. We identify workload characteristics which impact applicability and performance of window aggregation techniques. We present a generally applicable and highly efficient solution for streaming window aggregation. We show that general stream slicing is generally applicable and offers better performance than alternative approaches. Open Source Repository: tu-berlin-dima.github.io/scotty-window-processor