O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.
2nd BMBF Big Data All Hands Meeting and
2nd Smart Data Innovation Conference
Karlsruhe , October 11.-12., 2017
Presenting ...
The Growth of the Internet of Things
Gartner says 6.4 billion connected
"Things" will be in use in 2016 and
more than 20 b...
Goal
Provide real-time insights based on IoT data.
Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Mill...
Problem
• Billions of devices provide real-time data
• Result: Vast amount of data streams
Heavy Network Utilization Scala...
Solution
Produce and process data streams
based on the data demand of applications.
Jonas Traub – TU Berlin / DFKI – Effic...
State of the Art Approach
Data Stream Production with Periodic Sampling
Major Challenges:
• Oversampling
• Missing Adaptiv...
Solution
On-Demand Data Streaming from Sensor Nodes
Optimized On-Demand Data
Streaming from Sensor Nodes
Jonas Traub – TU ...
State of the Art Approach
Provide all Data to Front-End Applications
Optimized On-Demand Data
Streaming from Sensor Nodes
...
Solution
Adaptive Data Reduction with Streaming Engines
Optimized On-Demand Data
Streaming from Sensor Nodes
I²: Interacti...
Solution
Adaptive Data Reduction with Streaming
Engines
Optimized On-Demand Data
Streaming from Sensor Nodes
I²: Interacti...
Solution
Efficient Processing of user-defined Windows
Optimized On-Demand Data
Streaming from Sensor Nodes
I²: Interactive...
Publications
Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 12
Optimized On-Demand Data Streaming
from Sensor Nodes
Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Vol...
Architecture Overview
14
Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
Architecture Overview
14
Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
Architecture Overview
14
Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
Architecture Overview
14
Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
Architecture Overview
14
Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
User-Defined Sampling Functions
19
• Provide an abstraction to define the data demand of applications.
• Upon a sensor rea...
User-Defined Sampling Functions
20
• Provide an abstraction to define the data demand of applications.
• Upon a sensor rea...
User-Defined Sampling Functions
21
Enable adaptive sampling techniques to reduce data transmission
e.g., Adam [Trihinas ‘1...
Sensor Read Fusion
22
Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
Sensor Read Fusion
23
1) Minimize Sensor Reads and Data Transfer:
Latest possible read time
2) Optimize Sensor Read Times:...
24
Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
Local Filtering
25
Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
Optimized On-Demand Data Streaming
from Sensor Nodes
Wrap-Up:
Tailor Data Streams to the Demand of Applications
• Define d...
Cutty: Aggregate Sharing
for User-Defined Windows
Paris Cabone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, Volker Ma...
Streaming Window Aggragation
Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 28
Stream Slicing
Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 29
Applicability of Stream Slicing
Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 30
Yes, we can do better!
Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 31
Cutty Overview
Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 32
Cutty: Aggregate Sharing
for User-Defined Windows
Wrap-Up:
Enable Stream Slicing beyond Simple Tumbling and Sliding Window...
I²: Interactive Real-Time Visualization
for Streaming Data
Jonas Traub, Nikolaas Steenbergen, Philipp Grulich, Tilmann Rab...
Architecture Overview
Jonas Traub et al. – I²: Interactive Real-Time Visualization for Streaming Data – EDBT 2017 35
Check out our Flink Forward Talk
youtube.com/watch?v=JNbq239JkK4
36
The Big Picture
Optimized On-Demand Data
Streaming from Sensor Nodes
Traub et al.; ACM SoCC’17
I²: Interactive Real-Time V...
Próximos SlideShares
Carregando em…5
×

Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

149 visualizações

Publicada em

The 2nd Big Data All Hands Meeting (BDAHM) is held in conjunction with the 2nd Smart Data Innovation Conference (SDIC).

This talk about "Efficiently Handling Streams from Millions of Sensors" provides an introduction to three recent research works dealing with massive sensor data streams in the age of the Internet of Things.

We present three publications and show in a big picture how they enable dealing with potentially billions of IoT sensors.

1)Real-time sensor data enables diverse applications such as smart metering, traffic monitoring, and sport analysis. In the Internet of Things, billions of sensor nodes form a sensor cloud and offer data streams to analysis systems. However, it is impossible to transfer all available data with maximal frequencies to all applications. Therefore, we need to tailor data streams to the demand of applications. We contribute a technique that optimizes communication costs while maintaining the desired accuracy. Our technique schedules reads across huge amounts of sensors based on the data-demands of a huge amount of concurrent queries. We introduce user-defined sampling functions that define the data-demand of queries and facilitate various adaptive sampling techniques, which decrease the amount of transferred data. Moreover, we share sensor reads and data transfers among queries. Our experiments with real-world data show that our approach saves up to 87% in data transmissions.

2) We present Cutty, an innovative technique for the efficient aggregation of user-defined windows over data streams. While the aggregation of periodic sliding and tumbling windows was extensively studied in the past, little to no work was done on optimizing the aggregation of common, non-periodic windows. Typical examples of non-periodic windows are punctuation windows and sessions which can implement complex business logic. Cutty performs aggregate sharing for data stream windows, which are declared as user-defined functions (UDFs) and can contain arbitrary business logic. Cutty outperforms the state of the art for aggregate sharing on single and multiple queries. Moreover, it enables aggregate sharing for a broad class of non-periodic UDWs.

3) We present I², an interactive development environment for real-time analysis pipelines, which is based on Apache Flink and Apache Zeppelin. The sheer amount of available streaming data frequently makes it impossible to visualize all data points at the same time. I² coordinates running cluster applications and corresponding visualizations such that only the currently depicted data points are processed in Flink and transferred towards the front end. We show how Flink jobs can adapt to changed visualization properties at runtime to allow interactive data exploration on high bandwidth data streams. Moreover, we present a data reduction technique which minimizes data transfer while providing loss free time-series plots.

Publicada em: Dados e análise
  • Seja o primeiro a comentar

Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

  1. 1. 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference Karlsruhe , October 11.-12., 2017 Presenting at Efficiently Handling Streams from Millions of Sensors Jonas Traub – TU Berlin / DFKI 1
  2. 2. The Growth of the Internet of Things Gartner says 6.4 billion connected "Things" will be in use in 2016 and more than 20 billion in 2020. Year # Devices (in billions) Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 2
  3. 3. Goal Provide real-time insights based on IoT data. Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 3
  4. 4. Problem • Billions of devices provide real-time data • Result: Vast amount of data streams Heavy Network Utilization Scalability Challenges Increasing Latencies Financial Costs Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 4
  5. 5. Solution Produce and process data streams based on the data demand of applications. Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 5
  6. 6. State of the Art Approach Data Stream Production with Periodic Sampling Major Challenges: • Oversampling • Missing Adaptivity Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 6
  7. 7. Solution On-Demand Data Streaming from Sensor Nodes Optimized On-Demand Data Streaming from Sensor Nodes Jonas Traub – TU Berlin / DFKI – Efficently Handling Streams from Millions of Sensors 7
  8. 8. State of the Art Approach Provide all Data to Front-End Applications Optimized On-Demand Data Streaming from Sensor Nodes Major Challenge: • Front End Overload Jonas Traub – TU Berlin / DFKI – Efficently Handling Streams from Millions of Sensors 8
  9. 9. Solution Adaptive Data Reduction with Streaming Engines Optimized On-Demand Data Streaming from Sensor Nodes I²: Interactive Real-Time Visualization for Streaming Data Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 9
  10. 10. Solution Adaptive Data Reduction with Streaming Engines Optimized On-Demand Data Streaming from Sensor Nodes I²: Interactive Real-Time Visualization for Streaming Data Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 10
  11. 11. Solution Efficient Processing of user-defined Windows Optimized On-Demand Data Streaming from Sensor Nodes I²: Interactive Real-Time Visualization for Streaming Data Cutty: Aggregate Sharing for User-Defined Windows Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 11
  12. 12. Publications Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 12
  13. 13. Optimized On-Demand Data Streaming from Sensor Nodes Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl Santa Clara, California, September 25-27, 2017 13
  14. 14. Architecture Overview 14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  15. 15. Architecture Overview 14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  16. 16. Architecture Overview 14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  17. 17. Architecture Overview 14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  18. 18. Architecture Overview 14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  19. 19. User-Defined Sampling Functions 19 • Provide an abstraction to define the data demand of applications. • Upon a sensor read, request the next sensor read. Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  20. 20. User-Defined Sampling Functions 20 • Provide an abstraction to define the data demand of applications. • Upon a sensor read, request the next sensor read. • Make read time tolerances explicit. Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  21. 21. User-Defined Sampling Functions 21 Enable adaptive sampling techniques to reduce data transmission e.g., Adam [Trihinas ‘15], FAST [Fan ‘14], L-SIP [Gaura ’13] Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  22. 22. Sensor Read Fusion 22 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  23. 23. Sensor Read Fusion 23 1) Minimize Sensor Reads and Data Transfer: Latest possible read time 2) Optimize Sensor Read Times: ● Check the paper for all details on the read time optimizer! Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  24. 24. 24 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  25. 25. Local Filtering 25 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017
  26. 26. Optimized On-Demand Data Streaming from Sensor Nodes Wrap-Up: Tailor Data Streams to the Demand of Applications • Define data demand: User-Defined Sampling Functions • Schedule sensor reads and data transfer on-demand • Optimize read times globally - for all users and queries Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl 26
  27. 27. Cutty: Aggregate Sharing for User-Defined Windows Paris Cabone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, Volker Markl 27
  28. 28. Streaming Window Aggragation Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 28
  29. 29. Stream Slicing Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 29
  30. 30. Applicability of Stream Slicing Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 30
  31. 31. Yes, we can do better! Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 31
  32. 32. Cutty Overview Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 32
  33. 33. Cutty: Aggregate Sharing for User-Defined Windows Wrap-Up: Enable Stream Slicing beyond Simple Tumbling and Sliding Windows • Cutty enables Stream Slicing for a broad class of windows • Cutty combines Stream Slicing, On-the-fly Aggregation, Aggregate Sharing, and Aggregate Trees Paris Cabone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, Volker Markl 33
  34. 34. I²: Interactive Real-Time Visualization for Streaming Data Jonas Traub, Nikolaas Steenbergen, Philipp Grulich, Tilmann Rabl, Volker Markl 34
  35. 35. Architecture Overview Jonas Traub et al. – I²: Interactive Real-Time Visualization for Streaming Data – EDBT 2017 35
  36. 36. Check out our Flink Forward Talk youtube.com/watch?v=JNbq239JkK4 36
  37. 37. The Big Picture Optimized On-Demand Data Streaming from Sensor Nodes Traub et al.; ACM SoCC’17 I²: Interactive Real-Time Visualization for Streaming Data Traub et al.; EDBT’17 Cutty: Aggregate Sharing for User-Defined Windows Carbone et al.; CIKM’16 Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors

×